Email List: Xaustin-review-lX

Defect in XCU find

To: austin-review-l@xxxxxxxxxxxxx
Subject: Defect in XCU find
From: ebb9@xxxxxxx
Date: Sat, 8 Sep 2007 18:56:50 +0100 (BST)
        Defect report from : Eric Blake , N/A

(Please direct followup comments to austin-group-l@xxxxxx)

@ page 452 line 17500 section find comment {ebb.find}

Problem:

Edition of Specification (Year): 2004

Defect code :  3. Clarification required

The specification for find is unclear about what happens when path
operands end with slash(es).  This issue was sparked by a discussion
about the GNU implementation of find,
https://savannah.gnu.org/bugs/?20970.

The first question is whether each filename subjected to the
operand_expression should be canonicalized to the minimum number of
slashes, or whether filenames are permitted to have multiple slashes.

Many existing implementations preserve extra slashes passed in a path
operand, but ensure that all additional slashes added during path
traversal are minimized, and omit trailing slashes on directory
names.  For example:

$ find foo///
foo///
foo///bar
foo///bar/blah

But this does not appear to be specified anywhere.  Additionally,
keeping the extra slashes may make the difference between a
successful operation and an ENAMETOOLONG failure on the resulting
name.

Blindly stripping all trailing slashes has a different problem, since
if 'foo' is a symlink to a directory, 'foo' and 'foo/' mean different
things, so a user performing 'find foo/' should be given 'foo/', not
'foo', in the output.  Perhaps implementations should be allowed to
strip redundant slashes, but be required to not strip all slashes,
although the proposal below merely codifies current practice of
preserving all slashes provided in the path operands and minimizing
the slashes encountered during traversal.
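
As a rough illustration of the current practice described above (keep
the path operand exactly as spelled, minimize slashes in everything
found below it), here is a minimal Python sketch.  The function name
traversal_name and its arguments are hypothetical and are not taken
from any find implementation; it only models how the reported name is
assembled, not the traversal itself.

```python
def traversal_name(operand, relative=""):
    """Model the name reported for a file found under a path operand.

    operand  -- the path operand exactly as spelled on the command line
    relative -- the file's location relative to the operand ("" for the
                operand itself); hypothetical input for illustration
    """
    # Drop empty components so the relative part has only single
    # slashes and no trailing slash.
    parts = [p for p in relative.split("/") if p]
    if not parts:
        # The operand itself is reported with its original spelling.
        return operand
    # Add a separator only if the operand does not already end in one.
    sep = "" if operand.endswith("/") else "/"
    return operand + sep + "/".join(parts)


for rel in ("", "bar", "bar/blah"):
    print(traversal_name("foo///", rel))
```

With the operand foo/// this reproduces the three names in the example
above.  Note that the sketch never shortens the operand, so it does
nothing to avoid the ENAMETOOLONG concern.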

The second question has to do with -name behavior.  The specification
for -name (line 17535) requires that the basename of the file being
examined be matched against the pattern.  Both basename(1) (XCU line
7332) and basename(3) (XSH line 5120) are clear that trailing slashes
do not factor into a basename; likewise, the definition for basename
in XBD 3.40, by reference to the definition of filename, implies no
slashes.  This would imply that 'find "$path" -name \*/' must never
print anything, unless $path happens to resolve to / or //, because
no other filenames can have a basename with a trailing slash.
However, many current find implementations preserve the trailing
slash of command-line path arguments, resulting in this strange
behavior:

$ find foo -name foo
foo
$ find foo -name foo/
$ find foo/ -name foo
$ find foo/ -name foo/
foo/
$

Whereas an implementation that did a true basename match, according
to the definition of basename, should do:

$ find foo -name foo
foo
$ find foo -name foo/
$ find foo/ -name foo
foo/
$ find foo/ -name foo/
$

The proposal below leaves this area unspecified, although it would
also be possible to declare the standard as clear and existing
implementations as buggy for not stripping the trailing slash before
performing the pattern match.
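
The difference between the two behaviors shown above can be modeled
with a short Python sketch.  This is an assumption-laden illustration,
not any implementation's actual code: the helper names posix_basename,
preserved_name and name_matches are hypothetical, Python's
fnmatch.fnmatchcase stands in for find's pattern matching, and a path
consisting only of slashes is simplified to "/".

```python
from fnmatch import fnmatchcase


def posix_basename(path):
    """Final component with trailing slashes removed (true basename)."""
    stripped = path.rstrip("/")
    if not stripped:
        # Only slashes (simplified to "/"), or an empty string.
        return "/" if path else "."
    return stripped.rsplit("/", 1)[-1]


def preserved_name(operand):
    """Final component with the operand's trailing slashes kept."""
    stripped = operand.rstrip("/")
    if not stripped:
        return operand
    trailing = operand[len(stripped):]
    return stripped.rsplit("/", 1)[-1] + trailing


def name_matches(operand, pattern, strip):
    """Model -name applied to a path operand.

    strip=False models implementations that keep the trailing slash;
    strip=True models a true basename match.
    """
    name = posix_basename(operand) if strip else preserved_name(operand)
    return fnmatchcase(name, pattern)


for strip in (False, True):
    for operand in ("foo", "foo/"):
        for pattern in ("foo", "foo/"):
            print(strip, operand, pattern,
                  name_matches(operand, pattern, strip))
```

With strip=False, only the pairs (foo, foo) and (foo/, foo/) match,
as in the first transcript; with strip=True, both operands match the
pattern foo and nothing matches foo/, as in the second.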

Finally, GNU find has recently added a warning message when the -name
operand is used with a pattern containing a slash, but exits with
status 0 implying success:

$ find foo -name '*/*'
find: warning: Unix filenames usually don't contain slashes (though
pathnames do).  That means that '-name `*/*'' will probably evaluate
to false all the time on this system.  You might find the
'-wholename' test more useful, or perhaps '-samefile'.
Alternatively, if you are using GNU grep, you could use 'find ...
-print0 | grep -FzZ `*/*''.

Is such an extension forbidden by the current POSIX rules, because
a diagnostic was printed to stderr without affecting the exit status?
The proposal below makes the use of a slash in the -name pattern
unspecified (so that the GNU warning no longer sparks questions).

An interpretation is requested, so that this can also be adjusted in
the next 200x draft.  Additionally, the 200x draft introduces the new
-path expression.  Should 'find foo// -path foo/' be required to
print nothing (since neither "foo" nor "foo//" match the pattern
foo/)?  Should 'find foo// -path foo//\*' print nothing, only print
the command-line spelling of foo// but no files or subdirectories, or
should it print all files in the hierarchy?


Action:

At line 17503, add the following sentences to the Description paragraph:

Each path operand shall be evaluated with the spelling provided,
including all trailing slashes; all other files encountered in the
hierarchy shall consist of the concatenation of the path operand, a
slash if the path operand did not end in one, and the filename relative
to the path operand, where the relative portion contains no dot or
dot-dot components, no trailing slashes, and only single slashes
between pathname components.

At line 17538, add the following sentences to the -name paragraph:

Behavior is unspecified if pattern contains a slash character.  Also,
it is unspecified whether the -name operand strips trailing slashes
from a path operand before performing the pattern match.
