Austin Group Minutes of the 27 May Teleconference Austin-213
Submitted by Andrew Josey, The Open Group. May 28, 2004
Attendees
Andrew Josey, The Open Group
Don Cragun, Sun, PASC OR
Ulrich Drepper, Red Hat
Nick Stoughton, USENIX, WG15 OR
Glenn Fowler, AT&T
Apologies
Dave Butenhof, HP
Mark Brown, IBM, TOG OR
Joanna Farley, Sun
Draft Status
---------------
No new status to report.
The hardcopy run is due back from the printers. There
are two copies remaining.
http://www.opengroup.org/bookstore/catalog/t041y.htm
Defect Report Processing
-------------------------
The group picked up on the latest batch of defect reports,
which are available at the following URL:
http://www.opengroup.org/austin/aardvark/latest/
XBD ERN 7, BRE nested subpatterns (XBD BRE defn vs regcomp()) OPEN
Further investigations during the week led to a number of responses,
with a proposal from Glenn Fowler. A summary follows:
-start summary
Observations of historical sed implementation behavior need to
determine if the sed implementations use POSIX regcomp()/regexec() and
return '' because the underlying regcomp()/regexec() returns
''.
The correlation between sed and regcomp()/regexec() is important
because regcomp()/regexec() implementations that return '' are not
compliant, and have not been since at least 1997. If the
regcomp()/regexec() implementations for those sed implementations using
regcomp()/regexec() were fixed to comply with the standard then those
sed implementations would return '', and the 'historical sed'
argument for those sed implementations would become the 'historical
regcomp()/regexec()' argument.
Assuming the intent of the standard, stated or not, would be to keep
the BRE and regcomp()/regexec() descriptions consistent, the 2001 BRE
addition should be fixed. Otherwise there is a chance that this
addition, as it stands, could be used to invalidate (or 'unspecify')
portions of the regcomp()/regexec() description, with effects well
beyond the scope of sed.
As ERN-7 suggests, one solution to the discrepancy would be to fix
the 2001 addition to 9.3.6 (item 3.) to include the effects of nested
subexpressions on back-reference expressions. This seems like a
reasonable course of action since it corrects a problem introduced
in 2001 and leaves the regcomp()/regexec() description (which
predates the 2001 addition) intact.
The proposed change is based on the regcomp()/regexec() description
of the regexec() pmatch array.
[* original 9.3.6 (item 3.) text *]
3. The back-reference expression '\n' shall match the same (possibly
empty) string of characters as was matched by a subexpression enclosed
between "\(" and "\)" preceding the '\n' . The character 'n' shall be a
digit from 1 through 9, specifying the nth subexpression (the one that
begins with the nth "\(" from the beginning of the pattern and ends
with the corresponding paired "\)" ). The expression is invalid if less
than n subexpressions precede the '\n'.
[* replace this text *]
For example, the expression "\(.*\)\1$" matches a line consisting of
two adjacent appearances of the same string, and the expression
"\(a\)*\1" fails to match 'a' . When the referenced subexpression
matched more than one string, the back-referenced expression shall
refer to the last matched string. If the subexpression referenced by
the back-reference matches more than one string because of an asterisk
( '*' ) or an interval expression (see item (5)), the back-reference
shall match the last (rightmost) of these strings.
[* with this text *]
The string matched by a contained subexpression shall be within the
string matched by the containing subexpression. If the containing
subexpression does not match, or if there is no match for the contained
subexpression within the string matched by the containing subexpression
then back-reference expressions corresponding to the contained
subexpression shall not match. When a subexpression matches more than
one string, a back-reference expression corresponding to the
subexpression shall refer to the last matched string. For example, the
expression "^\(.*\)\1$" matches lines consisting of two adjacent
appearances of the same string, the expression "\(a\)*\1" fails to
match 'a', the expression "\(a\(b\)*\)*\2" fails to match 'abab', and
the expression "^\(ab*\)*\1$" matches 'ababbabb' but fails to match
'ababbab'.
Also, I ran ERN-7 by Doug McIlroy and he noted that the sed substitute
command description only specifies what is substituted when a backreference
expression refers to a subexpression that matches:
The characters "\n", where n is a digit, shall be replaced by the
text matched by the corresponding backreference expression.
This should probably be revised to handle cases where the backreference
expression does not match:
The characters "\n", where n is a digit, shall be replaced by the
text matched by the corresponding backreference expression, or by
the empty string if the corresponding backreference expression
does not match.
-end summary
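The example expressions in the proposed 9.3.6 text can be tried against
a local regcomp()/regexec(). The following is a minimal sketch, not
part of the proposal. The helper names bre_matches() and
report_ern7_examples() and the subject string 'abab' used for the
first expression are assumptions made for illustration. The "proposed"
column only restates the outcomes listed in the proposed text; what an
implementation actually returns is the point under discussion in this
ERN, so no particular output is claimed here.

```c
#include <regex.h>
#include <stdio.h>

/* Returns 1 if the BRE matches somewhere in text, 0 if it does not,
 * and -1 if regcomp() rejects the pattern. Any regexec() failure is
 * reported as "no match" to keep the sketch short. */
static int bre_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, 0) != 0)   /* cflags 0 selects BRE syntax */
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Prints the local result next to the outcome given in the proposed
 * replacement text for each example expression. */
static void report_ern7_examples(void)
{
    static const struct {
        const char *bre;
        const char *text;
        int proposed;            /* 1 = matches, 0 = fails to match */
    } cases[] = {
        { "^\\(.*\\)\\1$",           "abab",     1 },  /* subject assumed */
        { "\\(a\\)*\\1",             "a",        0 },
        { "\\(a\\(b\\)*\\)*\\2",     "abab",     0 },
        { "^\\(ab*\\)*\\1$",         "ababbabb", 1 },
        { "^\\(ab*\\)*\\1$",         "ababbab",  0 },
    };
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++)
        printf("%-22s %-10s local=%d proposed=%d\n",
               cases[i].bre, cases[i].text,
               bre_matches(cases[i].bre, cases[i].text),
               cases[i].proposed);
}
```

Calling report_ern7_examples() from main() prints one line per
expression, which makes any difference between a local implementation
and the proposed wording easy to spot.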
The feeling is that an interpretation will be needed: the standard is
unclear and needs to be clarified in the next revision.
The relevant sections of the standard are as follows:
XBD p 174 9.3.6
XCU p 846 substitute command
Glenn is taking an action to review other uses of BRE within the
standard and report back to the mailing list.
This is thus being left open until the next meeting.
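The revised substitute command wording suggested in the summary
(replace "\n" by the matched text, or by the empty string if the
corresponding backreference expression does not match) can be sketched
in terms of the regexec() pmatch array. This is an illustration only:
expand_backrefs() is a hypothetical helper, not sed source, and it
assumes that an unmatched subexpression is reported with rm_so set to
-1. It deliberately ignores '&' and escaped backslashes in the
replacement.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Expands "\n" (n = 1..9) in repl using the offsets in pm[], which
 * refer to subject. A subexpression that did not match (rm_so == -1,
 * or n beyond the npm entries supplied) contributes the empty string.
 * Returns the length written to out, or -1 if out is too small. */
static int expand_backrefs(const char *repl, const char *subject,
                           const regmatch_t *pm, size_t npm,
                           char *out, size_t outsz)
{
    size_t o = 0;
    const char *p;

    if (outsz == 0)
        return -1;
    for (p = repl; *p != '\0'; p++) {
        if (p[0] == '\\' && p[1] >= '1' && p[1] <= '9') {
            size_t n = (size_t)(p[1] - '0');

            p++;                         /* consume the digit */
            if (n < npm && pm[n].rm_so != -1) {
                size_t len = (size_t)(pm[n].rm_eo - pm[n].rm_so);

                if (o + len >= outsz)
                    return -1;
                memcpy(out + o, subject + pm[n].rm_so, len);
                o += len;
            }
            /* unmatched subexpression: nothing is copied */
        } else {
            if (o + 1 >= outsz)
                return -1;
            out[o++] = *p;
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

With pm[1] covering 'a' in the subject 'ab' and pm[2] unmatched, the
replacement "<\1|\2>" expands to "<a|>".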
XBD ERN 11 key_t should be arithmetic type ? Accept as marked below
The group agreed with the proposal that key_t be changed in
to just be an arithmetic type to be consistent with XSH section 2.12
Data Types, and noted that any implementation that
has key_t as a pointer would be broken by this change.
It was agreed to put this down the interpretations track. The standard
is inconsistent and no conformance distinction can be made for the
current standard.
The interpretation should include the recommendation that in a future
revision we include the change as in the ERN. The proposed change
would correct an inconsistency which has been around since at least XPG4.
XBD ERN 12 option handling in unistd.h Accept as Marked below
This should be treated consistently with XBD ERN 9 and
go down the interpretations track as part of the same interpretation.
Arising recommendations for a future revision are as follows:
In section: Constants for Options and Option Groups
Delete in paragraph 1:
"If these are undefined, the fpathconf(), pathconf(), or sysconf()
functions can be used to determine whether the option is provided
for a particular invocation of the application."
Change in paragraph 2 from:
If a symbolic constant is defined with the value -1, the option is not
supported.
To:
If a symbolic constant is not defined or is defined with the value -1, the option is not
supported.
Change in paragraph 4 from:
The application can check at runtime to see whether the option is supported by
calling fpathconf(), pathconf(), or sysconf() with the indicated name
parameter.
To:
The application can check at runtime to see whether the option is supported
for a particular invocation of the application by
calling fpathconf(), pathconf(), or sysconf() with the indicated name
parameter.
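The recommended wording can be read as a decision procedure for an
application. The following is a sketch under stated assumptions:
option_supported() and sporadic_server_supported() are hypothetical
helper names, _POSIX_SPORADIC_SERVER is used only as an example option,
and the treatment of a value greater than zero as "always supported" is
the usual reading of the section rather than something stated in these
minutes.

```c
#include <unistd.h>

/* defined:       nonzero if the symbolic constant is defined
 * compile_value: its value at compile time (ignored if not defined)
 * runtime_value: the result of sysconf() for the matching name
 * Returns 1 if the option can be used, 0 otherwise. */
static int option_supported(int defined, long compile_value,
                            long runtime_value)
{
    if (!defined || compile_value == -1)
        return 0;                 /* not defined, or defined as -1 */
    if (compile_value > 0)
        return 1;                 /* assumed: always supported */
    return runtime_value != -1;   /* otherwise decided at runtime */
}

/* Applies the check to the sporadic server option. If the _SC_ name
 * is not visible as a macro, the sketch conservatively reports the
 * option as not supported. */
static int sporadic_server_supported(void)
{
#if defined(_POSIX_SPORADIC_SERVER) && defined(_SC_SPORADIC_SERVER)
    return option_supported(1, (long)(_POSIX_SPORADIC_SERVER + 0),
                            sysconf(_SC_SPORADIC_SERVER));
#else
    return option_supported(0, 0, -1);
#endif
}
```

The pure helper keeps the three cases (not defined or -1, greater than
zero, anything else) separate from the system query, so the rule itself
can be checked without depending on the host configuration.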
XBD ERN 13 sched.h option language Accept as Marked below
It was agreed to go with option 1 in the proposal, which is
to remove the lead in so that the following sentence
would start "The sched_param structure...." :
"In addition, if _POSIX_SPORADIC_SERVER or _POSIX_THREAD_SPORADIC_SERVER
is defined, the sched_param structure defined in shall contain
the following members in addition to those specified above:"
This is needed since the option constants may not be defined. This
way the margin marker notation would show the optional nature
of this requirement.
Next Steps
-----------
Andrew will update the aardvark reports with the latest inbound
defect reports.
Andrew will generate some new interpretations.
There are a number of open action items outstanding:
1. Don Cragun Pathname Resolution proposal
2. Larry Dwyer system() and threads
3. Joerg Schilling wording for XCU ERN 1 pax
4. Further investigation for XCU ERN 18.
The next teleconference call is scheduled for June 10, 2004.