_____________________________________________________________________________

Notice: This is an unapproved draft interpretation. Use at your own risk.
_____________________________________________________________________________

        Austin Group Interpretation reference
        1003.1-2001 #186
_____________________________________________________________________________

Interpretation Number:  186
Topic:                  find paths ending with slashes
Relevant Sections:      XCU find

Austin Group Interpretation Request:
------------------------------------

Date: Sat, 8 Sep 2007 18:56:50 +0100 (BST)

------------------------------------------------------------------------
7 Defect Report concerning (number and title of International Standard
or DIS final text, if applicable):

The Shell and Utilities Volume of IEEE Std 1003.1-2001

------------------------------------------------------------------------
8 Qualifier (e.g. error, omission, clarification required):

3. Clarification required

------------------------------------------------------------------------
9 References in document (e.g. page, clause, figure, and/or table
numbers):

Page: 452  Line: 17500  Section: find

XCUbug2.txt Enhancement Request Number 167

------------------------------------------------------------------------
10 Nature of defect (complete, concise explanation of the perceived
problem):

The specification for find is unclear what happens when path operands
end with slash(es).  This issue was sparked by a discussion about the
GNU implementation of find, https://savannah.gnu.org/bugs/?20970.

The first question has to do with whether each filename subjected to
the operand_expression should be canonicalized to the minimum number
of slashes, or whether they are permitted to have multiple slashes.
Many existing implementations preserve extra slashes passed in a path
operand, but ensure that all additional slashes added during path
traversal are minimized, and omit trailing slashes on directory names.
For example:

    $ find foo///
    foo///
    foo///bar
    foo///bar/blah

But this does not appear to be specified anywhere.  Additionally,
keeping the extra slashes may make the difference between a successful
operation and an ENAMETOOLONG failure on the resulting name.  Blindly
stripping all trailing slashes has a different problem, since if 'foo'
is a symlink to a directory, 'foo' and 'foo/' mean different things,
so a user performing 'find foo/' should be given 'foo/', not 'foo', in
the output.  Perhaps implementations should be allowed to strip
redundant slashes, but be required to not strip all slashes, although
the proposal below merely codifies current practice of preserving all
slashes provided in the path operands and minimizing the slashes
encountered during traversal.

The second question has to do with -name behavior.  The specification
for -name (line 17535) requires that the basename of the file being
examined be matched against the pattern.  Both basename(1) (XCU line
7332) and basename(3) (XSH line 5120) are clear that trailing slashes
do not factor into a basename; likewise, the definition of basename in
XBD 3.40, by reference to the definition of filename, implies no
slashes.  This would imply that 'find "$path" -name \*/' must never
print anything, unless $path happens to resolve to / or //, because no
other filenames can have a basename with a trailing slash.
However, many current find implementations preserve the trailing slash
of command-line path arguments, resulting in this strange behavior:

    $ find foo -name foo
    foo
    $ find foo -name foo/
    $ find foo/ -name foo
    $ find foo/ -name foo/
    foo/
    $

Whereas an implementation that did a true basename match, according to
the definition of basename, should do:

    $ find foo -name foo
    foo
    $ find foo -name foo/
    $ find foo/ -name foo
    foo/
    $ find foo/ -name foo/
    $

The proposal below leaves this area unspecified, although it would
also be possible to declare the standard as clear and existing
implementations as buggy for not stripping the trailing slash before
performing the pattern match.

Finally, GNU find has recently added a warning message when the -name
operand is used with a pattern containing a slash, but exits with
status 0 implying success:

    $ find foo -name '*/*'
    find: warning: Unix filenames usually don't contain slashes
    (though pathnames do).  That means that '-name `*/*'' will
    probably evaluate to false all the time on this system.  You
    might find the '-wholename' test more useful, or perhaps
    '-samefile'.  Alternatively, if you are using GNU grep, you could
    use 'find ... -print0 | grep -FzZ `*/*''.

Is such an extension forbidden by the current POSIX rules, because a
diagnostic was printed to stderr without affecting the exit status?
The proposal below makes the use of a slash in the -name pattern
unspecified (so that the GNU warning no longer sparks questions).

An interpretation is requested, so that this can also be adjusted in
the next 200x draft.  Additionally, the 200x draft introduces the new
-path expression.  Should 'find foo// -path foo/' be required to print
nothing (since neither "foo" nor "foo//" match the pattern foo/)?
Should 'find foo// -path foo//\*' print nothing, only print the
command-line spelling of foo// but no files or subdirectories, or
should it print all files in the hierarchy?
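
The two -name behaviors contrasted in the transcripts above can be
modelled with a minimal Python sketch.  It is an illustration only and
not the code of any find implementation: the helper names
name_preserving_slashes, posix_basename and name_matches are
hypothetical, Python's fnmatch module stands in for the pattern
matching notation of the standard, and the "preserving" model is an
assumption chosen only to reproduce the four results shown for the
operands foo and foo/.

```python
from fnmatch import fnmatchcase


def name_preserving_slashes(operand: str) -> str:
    """Assumed model: last component plus the operand's trailing slashes."""
    stripped = operand.rstrip("/")
    if not stripped:
        return operand  # operand consists only of slashes
    trailing = operand[len(stripped):]
    return stripped.rsplit("/", 1)[-1] + trailing


def posix_basename(operand: str) -> str:
    """Basename with trailing slashes ignored, in the spirit of basename(1)."""
    stripped = operand.rstrip("/")
    if not stripped:
        return "/"  # "//" handling is implementation-defined; "/" assumed here
    return stripped.rsplit("/", 1)[-1]


def name_matches(operand: str, pattern: str, basename_fn) -> bool:
    """Would -name PATTERN select the path operand under the given model?"""
    return fnmatchcase(basename_fn(operand), pattern)


if __name__ == "__main__":
    for operand in ("foo", "foo/"):
        for pattern in ("foo", "foo/"):
            print(
                f"find {operand} -name {pattern}:",
                "preserving =",
                name_matches(operand, pattern, name_preserving_slashes),
                "basename =",
                name_matches(operand, pattern, posix_basename),
            )
```

Under these assumptions the two models differ only for the operand
foo/: the preserving model selects it for the pattern foo/, while the
basename model selects it for the pattern foo.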
------------------------------------------------------------------------
11 Solution proposed by the submitter (optional):

At line 17503, add the following sentences to the Description
paragraph:

    Each path operand shall be evaluated with the spelling provided,
    including all trailing slashes; all other files encountered in the
    hierarchy shall consist of the concatenation of the path operand,
    a slash if the path operand did not end in one, and the filename
    relative to the path operand, where the relative portion contains
    no dot or dot-dot components, no trailing slashes, and only single
    slashes between pathname components.

At line 17538, add the following sentences to the -name paragraph:

    Behavior is unspecified if pattern contains a slash character.
    Also, it is unspecified whether the -name operand strips trailing
    slashes from a path operand before performing the pattern match.

------------------------------------------------------------------------

Interpretation response
------------------------

The standard is unclear on this issue, and no conformance distinction
can be made between alternative implementations based on this.  This
is being referred to the sponsor.

Rationale:
-------------

None

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

Rationale for rejected or partial changes:

At line 17503, add the following sentence to the Description
paragraph:

    Each path operand shall be evaluated unaltered as it was provided,
    including all trailing slashes; all pathnames for other files
    encountered in the hierarchy shall consist of the concatenation of
    the current path operand, a slash if the current path operand did
    not end in one, and the filename relative to the path operand.
    The relative portion shall contain no dot or dot-dot components,
    no trailing slashes, and only single slashes between pathname
    components.

At line 17748, add a new example:

    8.
    Except for the root directory, and "//" on implementations where
    "//" does not refer to the root directory, no pattern given to
    -name will match a slash, because trailing slashes are ignored
    when computing the basename of the file under evaluation.  Given
    two empty directories named foo and bar, the following command:

        find foo/// bar/// -name foo -o -name 'bar?*'

    prints only the line "foo///".

Forwarded to Interpretations Group: Fri Oct 19 16:54:45 BST 2007
Proposed resolution: Fri Oct 19 16:54:45 BST 2007
Approved: Tue Nov 20 10:42:34 GMT 2007
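
The pathname construction rule and the new example 8 from the Notes to
the Editor can be illustrated with a minimal Python sketch.  It is an
illustration under stated assumptions, not normative text: the helper
names join_operand, basename_of and example_8 are hypothetical,
Python's fnmatch module stands in for the pattern matching notation of
the standard, and both directories are assumed to be empty so that
only the path operands themselves are evaluated.

```python
from fnmatch import fnmatchcase


def join_operand(operand: str, relative: str) -> str:
    """Operand as given, a slash if it did not end in one, then the relative name."""
    separator = "" if operand.endswith("/") else "/"
    return operand + separator + relative


def basename_of(path: str) -> str:
    """Basename with trailing slashes ignored; all-slash paths map to "/" (assumed)."""
    stripped = path.rstrip("/")
    return stripped.rsplit("/", 1)[-1] if stripped else "/"


def example_8(operands):
    """Evaluate -name foo -o -name 'bar?*' against each (empty) path operand."""
    selected = []
    for operand in operands:
        name = basename_of(operand)
        if fnmatchcase(name, "foo") or fnmatchcase(name, "bar?*"):
            selected.append(operand)  # printed with the spelling provided
    return selected


if __name__ == "__main__":
    print(join_operand("foo///", "bar/blah"))
    print(join_operand("foo", "bar"))
    print(example_8(["foo///", "bar///"]))
```

In this sketch the operand foo/// is selected because its basename is
foo, and bar/// is not selected because the pattern bar?* requires at
least one character after bar.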