Email List: Xaustin-group-futures-lX

Re: thread-private working directory

To: Harti Brandt <yyyyyy@xxxxxxxxxxxxxxxxxxx>
Subject: Re: thread-private working directory
From: Dave Butenhof <yyyyyyyyyyyyyy@xxxxxx>
Date: Fri, 01 Aug 2003 10:46:15 -0400
Cc: Alexander Terekhov <yyyyyyyy@xxxxxxxxxx>, yyyyyyyyyyyyyyyyyyyyyy@xxxxxxxxxxxxx
Organization: Hewlett-Packard Company
References: <OFB7D492BD.64512203-ONC1256D75.0048EE46-C1256D75.00492B68@de.ibm.com> <20030801152245.P4685@beagle.fokus.fraunhofer.de> <3F2A6FDC.8090208@hp.com> <20030801160422.E4685@beagle.fokus.fraunhofer.de>
Harti Brandt wrote:

On Fri, 1 Aug 2003, Dave Butenhof wrote:

DB>Harti Brandt wrote:
DB>
DB>>On Fri, 1 Aug 2003, Alexander Terekhov wrote:
DB>>
DB>>AT>Harti Brandt wrote:
DB>>AT>[...]
DB>>AT>> I wonder how one would implement a per-thread current directory
DB>>AT>> in an environment where there is no 1:1 mapping between user
DB>>AT>> threads and kernel threads.
DB>>AT>
DB>>AT>Something similar to (but probably a bit less convoluted ;-) )
DB>>AT>a per-thread signal mask, I guess.
DB>>AT>
DB>>AT>> While I could think of ways, they seem too convoluted to me.
DB>>
DB>>Given that I have only one per-process current working directory this
DB>>would force me to intercept all systemcalls that contain pathnames and
DB>>fiddle with these, wouldn't it?
DB>>
DB>>
DB>Yes, very likely, if you insisted on doing this on a kernel that's not
DB>designed to support it. Of course you'd also need to synchronize your
DB>kernel threads to be sure that only one could hit such a point at any
DB>time since you can have only one process kernel state at a time.
DB>
DB>This doesn't really even have anything to do with whether you've got 1:1
DB>or M:N mapping of threads. If your kernel has a CWD per process and
DB>you're trying to make it per-thread, you're in trouble.

That's just what I was trying to get. When Posix requires me to re-design
my Unix kernel to fit into Posix's thread model, something is wrong. With
Posix, not the kernel. If I have 1:1 mapping things may actually be easy,
because I can move the cwd from the per-process structure to the
per-thread structure. If I have an M:N mapping, I must handle this in user
space and repeating half of namei() in the pthreads library is a clear
sign of wrong design (of the specification).

POSIX doesn't require that any implementation support M:N, and many don't. As an implementor, that's your decision. If you aren't willing to do the work to support M:N, then presumably your decision is made, and that's fine. If you prefer not to bother but your customers insist, well, that's tough, but you can't blame it on POSIX. ;-)

But of course POSIX conformance does require that your kernel be [re]designed to implement the standard. But there's a critical semantic distinction here: "POSIX" doesn't require anything; only if you CHOOSE to CONFORM to POSIX do you place requirements on your implementation.

Given POSIX history, a new per-thread CWD would be an OPTION in POSIX and XSI, and you would choose whether to implement that option. This is indeed the principal justification for the confusing and sometimes annoying set of optional features -- because supporting them can often require substantial development investment, and you shouldn't be required to do that if it's not important to you and your customers. Even _POSIX_THREADS is a POSIX option that's not required for POSIX (or even POSIX 1003.1-2003) conformance. The Open Group chose to make this particular option mandatory for UNIX branding for UNIX 98 and beyond, but even so you're not required to seek that branding.

However, for anyone who DOES implement an option, POSIX provides a common syntax and (within limits) semantics for that option so that portable programs can exploit the feature where it exists.
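
As a concrete illustration of the option mechanism, here is a minimal sketch of how a portable program might probe for the threads option. The `_POSIX_THREADS` macro and `sysconf(_SC_THREADS)` are the standard interfaces; the helper name `posix_threads_available` is made up for this example.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper (not a standard function): returns 1 if the
 * POSIX threads option is usable by this program, 0 otherwise. */
int posix_threads_available(void)
{
#if defined(_POSIX_THREADS) && (_POSIX_THREADS + 0) > 0
    /* Positive value: the option is always supported here. */
    return 1;
#elif defined(_POSIX_THREADS) && (_POSIX_THREADS + 0) == 0
    /* Zero: headers are present, but support must be checked at run time. */
    return sysconf(_SC_THREADS) > 0;
#else
    /* Undefined or -1: the option is not available for compilation. */
    return 0;
#endif
}
```

A program would call `posix_threads_available()` once at startup and fall back to a single-threaded path if it returns 0. A hypothetical per-thread CWD option would presumably be probed the same way, with its own macro.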

DB>But there's nothing unique about this. If you want to have multiple
DB>threads within a process then each needs its own registers, including
DB>stack and instruction pointers. In POSIX, each has its own signal mask.
DB>If you want each to have its own "security persona", you need to store
DB>that. If each has its own CWD, you need to store that, too. To do M:N
DB>you either need to explicitly context switch each piece of data
DB>(expensive when they're kernel state), or you need abstracted
DB>user-managed contexts. In our case, the kernel gets "all this stuff"
DB>through the thread register. Adding a per-thread CWD would be a trivial
DB>matter of adding the field to the "shared area".

So that comes down to the question that I saw earlier in this thread:
if a thread has everything a process has, what's the difference between
them? I cannot pin it down, but I have the feeling that if you find
yourself re-implementing the kernel in libpthreads there is something
wrong with the model. If you find that you can globally rename process -> thread
in the kernel, something is wrong with the model. Somewhere there is a fine
border between things a thread should have and what it should not have...

There's always a fine line. Yes, calling a "process" a "thread" doesn't enable anyone to do something they couldn't do before. It's easy to label and characterize the extremes; it's the middle ground that gets fuzzy.

There were many battles fought to draw the line that's currently in the standard. Many of them were fought multiple times, often with slightly different casts. What's there is the result of often difficult compromises. The signal model is a good example, because several signal models were proposed, developed, written up, and completely trashed (over many badly flame-singed heads) before Nawaf Bitar finally came up with the revolutionary "grand unified signal compromise" that remains largely intact in the current standard. Not because it's perfect, but because it was something everyone could understand, implement, use, and live with. (And I'm sure the fact that everyone in the working group at the time was sick and tired of the battle didn't hurt.)

Fork was another example, because it's almost a fundamental and universal contradiction with a shared address space -- yet so deeply ingrained in POSIX/UNIX culture that it couldn't possibly be fixed or replaced. The posix_spawn function was a long time coming, and probably will never be widely used because it came so late.
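
For readers who have not used it, a minimal sketch of `posix_spawnp()` doing the job usually done with fork() plus exec. The calls are standard POSIX; the wrapper name `run_child` is hypothetical, and error handling is deliberately thin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* the caller's environment, passed through unchanged */

/* Hypothetical helper: run argv[0] (looked up via PATH) as a child
 * process and return its exit status, or -1 on failure. */
int run_child(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() reports failure by returning an error number,
     * not by returning -1. No copy of the parent's address space
     * is ever made visible to the caller. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as `{"sh", "-c", "exit 3", NULL}`, the helper returns the child's exit status. The two NULL arguments are the file-actions and attributes objects, which is where the work normally done between fork() and exec would go.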

And yet, making CWD per-thread doesn't require "re-implementing the kernel in libpthread". It simply requires cooperation. Libpthread isn't just an add-on library; it's a part of the system, just like libc. Inseparable and deeply intertwined. An extension of the kernel into user space, really.

And in a 1:1 implementation it need be no more than a set of syscall stubs. Although even on a 1:1 implementation it usually IS a lot more, because many operations can be optimized knowing you're already in user space, without crossing the protection boundary into the kernel. You can optimize mutexes by taking uncontended locks in user-space, for example. You might choose to optimize chdir() by putting the CWD into user space and avoiding a kernel call; but nobody requires that.

If you're building a kernel to support M:N schedulers, then you probably want to do something like our "shared area", so the whole block of thread state (registers, CWD, signal mask, and whatever else) gets "switched" simply by the act of changing the processor's "thread register" (or various equivalents). But the standard, again, doesn't require any of this.
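
One way to picture "putting the CWD into user space" with kernel cooperation is the sketch below. It assumes `openat()` and C11 `_Thread_local`, both of which postdate this message (`openat()` entered POSIX in the 2008 revision), so it illustrates the idea rather than anything that was on the table here. The names `thr_chdir` and `thr_open` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-thread "current directory", held as a directory descriptor.
 * AT_FDCWD means "fall back to the process-wide CWD". */
static _Thread_local int thr_cwd_fd = AT_FDCWD;

/* Hypothetical per-thread chdir(): affects only the calling thread. */
int thr_chdir(const char *path)
{
    /* Relative paths resolve against this thread's current directory. */
    int fd = openat(thr_cwd_fd, path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    if (thr_cwd_fd != AT_FDCWD)
        close(thr_cwd_fd);   /* drop the previous per-thread directory */
    thr_cwd_fd = fd;
    return 0;
}

/* Hypothetical per-thread open(): the kernel still does the path walk,
 * so nothing like namei() is repeated in user space. */
int thr_open(const char *path, int flags, mode_t mode)
{
    return openat(thr_cwd_fd, path, flags, mode);
}
```

A thread that calls `thr_chdir("/some/dir")` and then `thr_open("file", O_RDONLY, 0)` opens `/some/dir/file` while other threads keep resolving against their own directories. The sketch leaves out closing the descriptor at thread exit and covers only open(); a full version would need a matching wrapper for every pathname call.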

Methinks that the CWD is and should stay a per-process attribute, as is
the root directory.

That was of course the decision of the working group at the time! But not all decisions are permanent. I do think that we ought to be a bit more conservative about radical changes now, though. In the original working group we could throw out a signal model and start from scratch because there were no implementations. (Even though DCE threads existed, the standard said "thou shalt not implement", so there WERE no implementations! ;-) ) Now, such a change would be a radical and unreasonable burden on too many people, for too little benefit to anyone. But THAT is an opinion, not a law...

--
/--------------------[ yyyyyyyyyyyyyy@xxxxxx ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

