I wrote, on 29 Oct 2009:
>
> Eric Blake <ebb9@byu.net> wrote, on 27 Oct 2009:
> >
> > I'm hoping, along with Joerg, that we can modify the proposed wording of
> > 167 to explicitly allow modification of environ itself [...]
>
> This seems like a workable compromise. There is no teleconference
> today, but if at least two of the ORs reply to this mail saying they
> would support this modification, then I would be happy to put together
> an updated set of changes for discussion in next week's teleconference.

Two of the ORs indicated that they would support the modification.
Below is the new set of changes I have come up with. One problem
that occurred to me is that when the implementation detects that
environ has changed, it would have to treat all the strings in
the new environment as "volatile" (as if they had been added
by putenv()), in case the application overwrites any of the
strings later on. I have tried to circumvent that by explicitly
allowing the implementation to copy the strings to a new array
and assign environ to point to it, and warning applications not
to rely on those strings remaining part of the environment.
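
To make the concern above concrete, here is a minimal sketch of how an implementation might notice that environ has been reassigned and respond by copying the strings. It is an illustration only, not how any particular C library works: the names owned_env, env_free(), env_reinitialize() and env_sync() are hypothetical, and a real implementation would also need locking and its own bookkeeping for putenv() strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Hypothetical implementation-private state: the array the
 * implementation last installed in environ, or NULL if none yet. */
static char **owned_env;

/* Free an array previously built by env_reinitialize(). */
static void env_free(char **env)
{
    if (env == NULL)
        return;
    for (size_t i = 0; env[i] != NULL; i++)
        free(env[i]);
    free(env);
}

/* Copy the application's strings into implementation-owned storage and
 * point environ at the copy. Returns 0 on success, -1 on failure. */
static int env_reinitialize(void)
{
    size_t n = 0;
    while (environ != NULL && environ[n] != NULL)
        n++;

    char **copy = malloc((n + 1) * sizeof *copy);
    if (copy == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(environ[i]) + 1;
        char *s = malloc(len);
        if (s == NULL) {
            copy[i] = NULL;          /* terminate the partial array */
            env_free(copy);
            return -1;
        }
        memcpy(s, environ[i], len);
        copy[i] = s;
    }
    copy[n] = NULL;

    /* Free the old copy only after copying, in case the application's
     * new array reuses strings that lived in the old copy. */
    env_free(owned_env);
    owned_env = copy;
    environ = copy;
    assert(environ[n] == NULL);
    return 0;
}

/* Would run at the start of getenv(), setenv(), unsetenv(), putenv(). */
static int env_sync(void)
{
    if (environ != owned_env)
        return env_reinitialize();
    return 0;
}
```

Because the strings are copied, a later overwrite of the application's own buffers no longer affects the environment, which is why the proposed wording warns applications not to rely on their strings remaining part of it.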

Changes to XBD...

At page 173 line 5476 section 8.1, change:

manipulating the environ variable

to:

assigning a new value to the environ variable

Add a new paragraph after line 5478:

If the application modifies the pointers to which environ
points, the behavior of all interfaces described in the System
Interfaces volume of POSIX.1-2008 is undefined.

Changes to XSH...

Remove the undefined behavior note in getenv (line 33856), setenv
(line 59347) and unsetenv (line 68256). The change history should
note that these are no longer needed because XBD 8.1 says the
behaviour of all interfaces is undefined.

Modify the rationale for getenv on page 1009, line 33885 from:

Conforming applications are required not to modify environ
directly, but to use only the functions described here to
manipulate the process environment as an abstract object. Thus,
the implementation of the environment access functions has
complete control over the data structure used to represent the
environment (subject to the requirement that environ be
maintained as a list of strings with embedded <equals-sign>
characters for applications that wish to scan the environment).
This constraint allows the implementation to properly manage
the memory it allocates, either by using allocated storage for all
variables (copying them on the first invocation of setenv() or
unsetenv()), or keeping track of which strings are currently in
allocated space and which are not, via a separate table or some
other means. This enables the implementation to free any allocated
space used by strings (and perhaps the pointers to them) stored in
environ when unsetenv() is called. A C runtime start-up procedure
(that which invokes main() and perhaps initializes environ) can
also initialize a flag indicating that none of the environment has
yet been copied to allocated storage, or that the separate table
has not yet been initialized.

In fact, for higher performance of getenv(), the implementation
could also maintain a separate copy of the environment in a data
structure that could be searched much more quickly (such as an
indexed hash table, or a binary tree), and update both it and the
linear list at environ when setenv() or unsetenv() is invoked.

to:

Conforming applications are required not to directly modify the
pointers to which environ points, but to use only the setenv(),
unsetenv() and putenv() functions, or assignment to environ
itself, to manipulate the process environment. This constraint
allows the implementation to properly manage the memory it
allocates. This enables the implementation to free any space it
has allocated to strings (and perhaps the pointers to them)
stored in environ when unsetenv() is called. A C runtime start-up
procedure (that which invokes main() and perhaps initializes
environ) can also initialize a flag indicating that none of the
environment has yet been copied to allocated storage, or that the
separate table has not yet been initialized. If the application
switches to a complete new environment by assigning a new value
to environ, this can be detected by getenv(), setenv(), unsetenv()
or putenv() and the implementation can at that point reinitialize
based on the new environment. (This may include copying the
environment strings into a new array and assigning environ to
point to it.)

In fact, for higher performance of getenv(), implementations
that do not provide putenv() could also maintain a separate copy
of the environment in a data structure that could be searched
much more quickly (such as an indexed hash table, or a binary
tree), and update both it and the linear list at environ when
setenv() or unsetenv() is invoked. On implementations that do
provide putenv(), such a copy might still be worthwhile but
would need to allow for the fact that applications can directly
modify the content of environment strings added with putenv().
For example, if an environment string found by searching the
copy is one that was added using putenv(), the implementation
would need to check that the string in environ still has the
same name (and value, if the copy includes values), and whenever
searching the copy produces no match the implementation would
then need to search each environment string in environ that
was added using putenv() in case any of them have changed their
names and now match. Thus each use of putenv() to add to the
environment would reduce the speed advantage of having the copy.
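
The point about putenv() strings changing their names can be seen from the application side. The sketch below is an illustration only: the variable names COLOR and SHADE and the function rename_in_place() are made up for the example. It relies on the putenv() requirement that the string itself becomes part of the environment, so altering the string alters the environment.

```c
#define _XOPEN_SOURCE 700   /* putenv() is an XSI function */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative storage; it must stay valid for as long as the string
 * is part of the environment, hence static rather than automatic. */
static char entry[] = "COLOR=red";

/* Add "COLOR=red" with putenv(), then rename the variable by writing
 * to the string directly. Returns 1 if the rename is visible through
 * getenv(), 0 if it is not, and -1 if putenv() fails. */
int rename_in_place(void)
{
    if (putenv(entry) != 0)
        return -1;
    assert(getenv("COLOR") != NULL);

    /* Overwrite the name part in place; "SHADE" has the same length
     * as "COLOR", so the '=' and the value are left untouched. */
    memcpy(entry, "SHADE", 5);

    return getenv("COLOR") == NULL && getenv("SHADE") != NULL;
}
```

An implementation that keeps a separate searchable copy would have a stale "COLOR" key at this point, which is the case the rationale text says it must check for.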

After page 772 line 25712 section exec, add two new paragraphs:

Applications can change the entire environment in a single
operation by assigning the environ variable to point to an array
of character pointers to the new environment strings.
After assigning a new value to environ, applications should
not rely on the new environment strings remaining part of the
environment, as a call to getenv(), [XSI]putenv(),[/XSI]
setenv(), unsetenv() or any function that is dependent on an
environment variable may, on noticing that environ has changed,
copy the environment strings to a new array and assign environ
to point to it.

Any application that directly modifies the pointers to which the
environ variable points has undefined behavior.
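
As an illustration of the two paragraphs above, an application might switch environments as sketched below. The function name switch_environment() and the contents of new_env[] are placeholders chosen for the example; in particular the PATH value shown is an assumption and is not claimed to give a conforming environment.

```c
#define _POSIX_C_SOURCE 200809L   /* for setenv() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Placeholder environment. Static, writable storage is used so that
 * the array and strings stay valid after the function returns. */
static char env_path[] = "PATH=/usr/bin:/bin";
static char env_lang[] = "LANG=C";
static char *new_env[] = { env_path, env_lang, NULL };

/* Replace the whole environment in one operation, then change one
 * variable through the interfaces. Returns 0 on success, -1 on error. */
int switch_environment(void)
{
    environ = new_env;

    /* From here on, go through getenv()/setenv()/unsetenv() rather
     * than reading or writing new_env[]: the implementation may have
     * copied the strings and pointed environ at its own array. */
    if (setenv("LANG", "POSIX", 1) != 0)
        return -1;

    const char *lang = getenv("LANG");
    assert(lang != NULL);
    return strcmp(lang, "POSIX") == 0 ? 0 : -1;
}
```

Writing to new_env[1] directly after the assignment would be modifying the pointers to which environ points, which the proposed text makes undefined.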

At page 779 line 25989 section exec, change:

The environ array should not be accessed directly by the
application.

The new process might be invoked in a non-conforming environment
if the envp array does not contain implementation-defined
variables required by the implementation to provide a
conforming environment. See the _CS_V7_ENV entry in <unistd.h>
and confstr() for details.

to:

When assigning a new value to the environ variable, applications
should ensure that the environment to which it will point contains
at least the following:

a. Any implementation-defined variables required by the
   implementation to provide a conforming environment. See the
   _CS_V7_ENV entry in <unistd.h> and confstr() for details.

b. A value for PATH which finds conforming versions of all
   standard utilities before any other versions.

The same constraint applies to the envp array passed to execle()
or execve(), in order to ensure that the new process image is
invoked in a conforming environment.
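
One way an application could obtain a suitable PATH value for item b is confstr(_CS_PATH). The sketch below shows that approach; the helper names confstr_dup() and make_conforming_path() are hypothetical. Where _CS_V7_ENV is defined, the same confstr_dup() helper could be called with it to fetch the variables for item a, which the application would then have to split into separate entries.

```c
#define _POSIX_C_SOURCE 200809L   /* for confstr() and _CS_PATH */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return a malloc'ed copy of the confstr() value for name, or NULL if
 * there is no value or memory runs out. The caller frees the result. */
static char *confstr_dup(int name)
{
    size_t need = confstr(name, NULL, 0);   /* size including the NUL */
    if (need == 0)
        return NULL;

    char *buf = malloc(need);
    if (buf == NULL)
        return NULL;

    size_t got = confstr(name, buf, need);
    if (got == 0 || got > need) {           /* value vanished or grew */
        free(buf);
        return NULL;
    }
    assert(strlen(buf) < need);
    return buf;
}

/* Build a malloc'ed "PATH=..." environment string from _CS_PATH, ready
 * to be placed in a new environ array or an envp array for execve().
 * Returns NULL on failure. */
char *make_conforming_path(void)
{
    char *path = confstr_dup(_CS_PATH);
    if (path == NULL)
        return NULL;

    size_t len = strlen(path);
    char *entry = malloc(5 + len + 1);      /* "PATH=" + value + NUL */
    if (entry != NULL) {
        memcpy(entry, "PATH=", 5);
        memcpy(entry + 5, path, len + 1);
    }
    free(path);
    return entry;
}
```

The returned string can be stored as one element of the array assigned to environ, or of the envp argument to execle() or execve().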

At page 1717 line 54880 section putenv, after:

The setenv() function is preferred over this function.

add (as part of the same paragraph):

One reason is that putenv() is optional and therefore less
portable. Another is that using putenv() can slow down
environment searches, as explained in the RATIONALE for getenv().

At page 1717 line 54882 section putenv, change:

The standard developers noted that putenv() is the only function
available to add to the environment without permitting memory
leaks.

to:

Refer to the RATIONALE section in [xref to setenv()].

Add to the end of the setenv() rationale (P1858 L59381):

See also the RATIONALE section in [xref to getenv()].

--
Geoff Clare <g.clare@opengroup.org>
The Open Group, Thames Tower, Station Road, Reading, RG1 1LX, England