Defect report from: Petr Baudis, KAM, Charles University, Prague
(Please direct followup comments to yyyyyyyyyyyyyy@xxxxxxxxxxxxx)
@ page 0 line 0 section uniq editorial {no idea}
Problem:
Edition of Specification (Year): 2004
Defect code : 2. Omission
http://www.opengroup.org/onlinepubs/000095399/utilities/uniq.html
In the OUTPUT FILES section describing the -c output format string, it is not
specified that the number of occurrences can be preceded by an arbitrary number
of spaces. That is the current behaviour of GNU uniq as well as other UNIX uniq
tools, and it is also shown that way in the informative EXAMPLES section.
This is important to specify because it means that you cannot, for example,
simply use cut -d ' ' to extract the number of occurrences (which the format
string would lead you to believe); you need to do further processing on the
uniq output first.
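The problem can be reproduced with a small sketch. To stay independent of any
particular uniq implementation, the padded line is simulated with printf; the
field width of 7 and the sample text "apple" are illustrative assumptions, not
something the standard specifies.

```shell
# Simulated `uniq -c` output line with a space-padded count.
# The width of 7 is an assumption chosen only for illustration.
line=$(printf '%7d %s' 3 'apple')

# With a single space as the delimiter, every leading space opens a new
# (empty) field, so field 1 is empty rather than the count.
naive=$(printf '%s\n' "$line" | cut -d ' ' -f 1)

printf 'field 1: [%s]\n' "$naive"
```

With six leading spaces in the simulated line, the count ends up in field 7,
and that position changes with the amount of padding.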
Action:
(i) First option: Change the format string to "%s %d %s", <arbitrary number of
whitespaces>, <number of duplicates>, <line>. This is the behaviour of the uniq
tools I know about, but it is sharply backwards incompatible with the current
state. The advantage is that you can post-process the output with | tr -s ' '
and then the fields will be at a fixed position for cut -d ' ' to consider.
(ii) Second option: Change the format string to "%s%d %s", <arbitrary number of
whitespaces>, <number of duplicates>, <line>. This is backwards compatible but
you need to preprocess with | sed 's/^ *//' (on second thought, this might be
the better solution anyway, and perhaps one to recommend in the informative
section as well).
(iii) Third option: The existing implementations are declared as
non-conformant. In that case, the EXAMPLES informative section should be
corrected not to contain the leading spaces.
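The post-processing steps mentioned in options (i) and (ii) can be sketched as
follows. The input line is again simulated with printf, and the padding width
and sample text are assumptions made only for this illustration.

```shell
# Simulated `uniq -c` output line; the width of 7 is an assumed padding.
line=$(printf '%7d %s' 3 'apple')

# Option (i) style: squeeze runs of spaces into one. A single leading
# space remains, so field 1 is empty and the count is field 2.
count_tr=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

# Option (ii) style: strip all leading spaces, so the count is field 1.
count_sed=$(printf '%s\n' "$line" | sed 's/^ *//' | cut -d ' ' -f 1)

printf 'tr: %s, sed: %s\n' "$count_tr" "$count_sed"
```

Note that tr -s ' ' also squeezes repeated spaces inside the line text itself,
whereas the sed expression touches only the leading padding.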