Defect report from : Paul Eggert , UCLA
(Please direct followup comments to yyyyyyyyyyyyyy@xxxxxxxxxxxxx)
@ page 177 line 6924 section awk objection {20040504a}
Problem:
Edition of Specification (Year): 2004
Defect code : 1. Error
The C99 standard introduced the notion of hexadecimal floating
constants, and since the POSIX "awk" specification refers to C99,
POSIX "awk" is required to support them. However, the POSIX
specification was not updated with this C99 change in mind, and as a
result POSIX "awk" is required to support hexadecimal numbers in some
contexts but not others. This should be fixed, either by requiring
support for hexadecimal numbers everywhere, or by disallowing them
everywhere.
Here's the problem. XCU page 177 lines 6924-6925 contain this
restriction:
    a. An integer constant cannot begin with 0x or include the
       hexadecimal digits 'a', 'b', 'c', 'd', 'e', 'f', 'A', 'B', 'C',
       'D', 'E', or 'F'.
However, restriction (a) contradicts the awk rationale, which says
(XCU page 185 lines 7293-7295):
    The description of numeric string processing is based on the
    behavior of the atof() function in the ISO C standard. While it is
    not a requirement for an implementation to use this function, many
    historical implementations of awk do.
Restriction (a) evidently was inspired by C89, where atof() did not
parse hexadecimal numbers. However, in C99 atof() must parse
hexadecimal numbers like "0xa" and "0xap0". Hence the rationale no
longer matches the text of the standard.
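
The C99 behavior referred to above can be checked with a minimal
sketch like the following. The helper name c99_value is hypothetical
and only wraps atof(); the results assume a C library that implements
the C99 rules for atof() and strtod().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from any standard): convert a string to a
 * double exactly as the C library's atof() does. */
double c99_value(const char *s)
{
    assert(s != NULL);  /* atof() needs a valid string */
    return atof(s);
}
```

With a C99-conforming library, c99_value("0xa") and c99_value("0xap0")
both yield 10, since "p0" scales the hexadecimal significand by 2 to
the power 0. A C89-only library would stop parsing at the 'x' and
yield 0 for both strings.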
The "awk" specification does not contain any restrictions against
hexadecimal floating constants. As a result of this
inconsistency, a conforming awk implementation must treat the
hexadecimal floating constant "0xap0" as a number equal to 10,
but "awk" is not allowed to treat the hexadecimal integer constant
"0xa" as a number equal to 10 -- even though atof() does so.
Also, restriction (a) causes an inconsistency with another POSIX
requirement (XCU page 157 lines 6050-6053):
    A string value shall be converted to a numeric value by the
    equivalent of the following calls to functions defined by the ISO C
    standard:

        setlocale(LC_NUMERIC, "");
        numeric_value = atof(string_value);
Hence, for example, the awk expression ("0xa" + 0 == 10) must evaluate
to 1, even though restriction (a) means that the similar expression
(split("0xa", a) && a[1] == 10) must evaluate to 0 because "0xa" is
not considered to be a numeric string.
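
The two outcomes above can be modeled in C with a rough sketch. This
is not how any awk implementation is written: the function names are
hypothetical, is_hex_integer() is only one reading of what restriction
(a) excludes, and field_equals_ten() ignores the leading and trailing
blanks that awk permits in numeric strings. It assumes a C99 library.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Explicit conversion, following the two calls quoted from XCU. */
double explicit_value(const char *s)
{
    assert(s != NULL);
    setlocale(LC_NUMERIC, "");
    return atof(s);
}

/* Assumed reading of restriction (a): "0x" or "0X" followed only by
 * hexadecimal digits is a hexadecimal integer constant. */
bool is_hex_integer(const char *s)
{
    assert(s != NULL);
    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X') || s[2] == '\0')
        return false;
    for (const char *p = s + 2; *p != '\0'; p++)
        if (!isxdigit((unsigned char)*p))
            return false;
    return true;
}

/* Simplified model of the awk comparison (field == 10): compare
 * numerically if the field is a numeric string, else as strings. */
bool field_equals_ten(const char *field)
{
    char *end;
    double v;

    if (is_hex_integer(field))
        return strcmp(field, "10") == 0;  /* excluded by restriction (a) */
    setlocale(LC_NUMERIC, "");
    v = strtod(field, &end);
    if (end == field || *end != '\0')
        return strcmp(field, "10") == 0;  /* not numeric at all */
    return v == 10.0;
}
```

Under these assumptions explicit_value("0xa") + 0 equals 10, while
field_equals_ten("0xa") is false and field_equals_ten("0xap0") is
true, which is the inconsistency described above.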
I see three possible fixes:
1. The standard is correct as-is. Conforming "awk" implementations
must parse hexadecimal floating constants and must reject
hexadecimal integer constants, and they must parse numeric
strings differently from strings explicitly converted to numbers.
(If this alternative is chosen, the rationale should explain this.)
2. The intent was for "awk" to disallow hexadecimal numbers; add
more restrictions that disallow hexadecimal floating constants.
3. The intent was for "awk" to use atof(), so remove the restriction
disallowing hexadecimal integer constants.
(1) is entirely unsatisfactory, as it's not internally consistent
and disagrees with the rationale.
(2) is internally consistent, but disagrees with the rationale.
(3) is internally consistent and agrees with the rationale. It can be
accomplished by removing restriction (a).
Action:
Remove the following text from XCU page 177 lines 6924-6925.
    a. An integer constant cannot begin with 0x or include the
       hexadecimal digits 'a', 'b', 'c', 'd', 'e', 'f', 'A', 'B', 'C',
       'D', 'E', or 'F'.
Append the following text to the awk rationale, after XCU page 185
line 7300:
    Historical implementations of awk did not parse hexadecimal integer
    or floating constants like "0xa" and "0xap0". Because C99 required
    support for these constants in atof(), support for them is now
    required in awk. This is a silent change to the awk language: for
    example, the expression ("0xap0" + 0) formerly returned 0, but now
    returns 10. Due to an oversight, the 2001 through 2004 editions of
    this standard required support only for hexadecimal floating
    constants, but this edition has corrected this to require support
    for hexadecimal integer constants as well.
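
The silent change mentioned in the proposed rationale can be
illustrated with a small C sketch. The function c89_style_value() is a
hypothetical stand-in for a C89-era atof(), approximated here by
treating a leading "0x" as the number 0 followed by unparsed text;
c99_style_value() simply calls atof() and assumes a C99 library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical approximation of a C89-era atof(): decimal input is
 * passed through, but "0x..." converts only the leading "0". */
double c89_style_value(const char *s)
{
    const char *p = s;

    assert(s != NULL);
    while (isspace((unsigned char)*p))
        p++;                      /* skip leading white space */
    if (*p == '+' || *p == '-')
        p++;                      /* skip an optional sign */
    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X'))
        return 0.0;               /* parsing stops at the 'x' */
    return atof(s);
}

/* C99 behavior: hexadecimal constants are parsed in full. */
double c99_style_value(const char *s)
{
    assert(s != NULL);
    return atof(s);
}
```

For the string "0xap0" the first function yields 0 and the second
yields 10, matching the before and after values of ("0xap0" + 0) given
in the rationale text; for a decimal string such as "12" the two agree.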