AI-2005-01-01
Austin-382 Page 1 of 1
Submitted by Don Cragun, Sun.
Sep 7, 2007

This is a response to my Action Item to provide a paper on pathname resolution. It is also a followup to austin-review-l alias sequence #1723 which turned into draft interpretation AI-016 on November 14th, 2003. As was noted at that point, the changes being suggested can't be made until the next revision of the standard. In this discussion, page and line numbers refer to draft 3 in the current POSIX.1-200x revision project.

As has been noted in discussions over the last 3.5 years, pathname resolution is sometimes applied to existing files and sometimes to files that are about to be created, but the pathname resolution general concept has never clearly stated how this is supposed to happen.

The wording being changed in this proposal is the same in POSIX.1-2003 (POSIX.1-2001 with TC1 applied), POSIX.1-2004 (POSIX.1-2001 with TC1 and TC2 applied), and in draft 3 of the revision currently in ballot, except for a couple of editorial issues that have been corrected from the earlier versions. Those corrections do not affect this discussion.

The basic idea remains the same: The standard needs to clearly specify that a trailing slash in a pathname that contains any non-slash characters can only be used to reference a directory.

As was pointed out in the earlier discussions, pathname resolution as described in the standard only clearly specifies what happens when resolving pathnames of existing files. By definition, pathname resolution has to fail on any attempt to resolve a pathname that names a file that is to be created by the call in question (e.g., mkfifo() and mkdir()). The standard also needs to clearly specify how pathname resolution works for files that are being created.

@ page draft 3: 99 line 2967-2969 section 4.12 objection {dwc:PR-1}

I have discussed this issue with several people who were involved in the original discussions that led to the text in the P1003.1a drafts that were incorporated into the definition of the Pathname Resolution general concept in XBD subclause 4.12. All have agreed that the current wording is intended to cover two cases (resolution of a pathname of a file that is about to be created, and resolution of a pathname of an existing file), but does not clearly describe the differences between these cases.

Furthermore, the current text does not accurately describe how resolution of a pathname with a trailing slash (for any pathname that contains any non-slash characters) is intended to fail if the pathname does not resolve to an existing directory or to the pathname of a directory that is to be created immediately after resolution of everything but the final component. The normative text of the standard is silent on what happens with rename("dir/", "dir2") when "dir2" names an existing empty directory or does not name an existing file, but the rationale clearly states that "dir/." and "dir/.." are not allowed. The intent of the working group that drafted these changes was to allow a trailing slash here. (We have also discussed whether mkdir("new/") and rename("dir/", "new/") should succeed or fail when "new" does not name an existing file.)

The way that this requirement is worded in the current POSIX standard (but using POSIX.1-200x draft 3 page and line numbers) is:

    ``A pathname that contains at least one non-slash character and that ends with one or more trailing slash characters shall be resolved as if a single dot character ('.') were appended to the pathname.''

This wording could be interpreted to require calls that use a trailing slash to fail, because each would be resolved as the corresponding "/." form:

    1. rmdir("dir/.") is required to fail with EINVAL,
    2. mkdir("new/.") is required to fail with ENOTDIR, and
    3. rename(..., "new/.") is required to fail with ENOTDIR.

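To make the literal reading concrete, the following is a minimal C sketch of the text as currently worded. It is illustrative only: the function names has_trailing_slash() and literal_dot_form() are hypothetical and do not come from the standard or from any implementation. It tests for "at least one non-slash character followed by one or more trailing slashes" and then builds the pathname that the "as if a single dot were appended" wording would have the implementation resolve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if path contains at least one
 * non-slash character and ends with one or more slashes, else 0.
 * Pathnames made up only of slashes ("/", "///") return 0. */
int has_trailing_slash(const char *path)
{
    assert(path != NULL);
    size_t len = strlen(path);
    if (len == 0 || path[len - 1] != '/')
        return 0;
    /* strspn() counts the leading run of '/' characters; if that run
     * covers the whole string there is no non-slash character. */
    return strspn(path, "/") < len;
}

/* Hypothetical helper: writes into out the pathname that a literal
 * reading of the current wording would resolve, i.e. path with '.'
 * appended when has_trailing_slash(path) is true, and path unchanged
 * otherwise. Returns 0 on success, -1 if out is too small. */
int literal_dot_form(const char *path, char *out, size_t outsize)
{
    int n = snprintf(out, outsize, "%s%s", path,
                     has_trailing_slash(path) ? "." : "");
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

Under this reading, rmdir("dir/") is handled as rmdir("dir/."), which is how the three failure cases listed above arise.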
I believe the intent of the working group was that if a pathname has a trailing slash character, it can only be used to refer to a directory. The original SVID3 wording (from Volume 1, page 4-11, top two paragraphs) in the definition of "pathname and path prefix" includes:

        "A pathname is used to identify a file. It consists
        of at most, {PATH_MAX} bytes, including a terminating
        null character. It has an optional beginning slash,
        followed by zero or more filenames separated by
    *** slashes. If the pathname refers to a directory, it
    *** may also have one or more trailing slashes. Multiple
        consecutive slashes may be interpreted in an
        implementation-defined manner, although more than two
        leading slashes are treated as a single slash.

        "If a pathname begins with a slash, the path search
        begins at the root directory. Otherwise, the search
    *** begins from the current working directory. If a
    *** pathname refers to a directory, it may also have one
    *** or more trailing slashes. Multiple consecutive slashes
        are considered the same as a single slash."

(Note that SVID3 was one of the reference documents used frequently in the development of POSIX.1-1990.)

I believe the two complete sentences on the lines marked with *** above cover the same intent, but System V did not enforce the part about trailing slashes being allowed only after directories and always threw away trailing slash characters. (In fact, SVID3 never specified what is supposed to happen if a pathname that does not refer to a directory has one or more trailing slashes.)

Unfortunately, the wording from SVID3 is ambiguous. "If a pathname refers to a directory" could mean "If a pathname refers to an EXISTING directory" or could mean "If a pathname refers to SOMETHING THAT WILL BE USED AS a directory". The unanimous consensus of the people I have discussed this with so far is that the desired meaning is the latter, except for a special case for the last argument to rename().

A short summary of the discussions leading to this conclusion follows. In these examples "dir" and "dir2" refer to existing files of type directory and "new" refers to a non-existent file.

mkdir():
    We all believe that mkdir("new") should be equivalent to mkdir("new/") although there were dissenting opinions at the start of the discussion. There are several uses of mkdir("new/") in third-party applications and this is clearly in line with one interpretation of the SVID3 wording. Many of these applications are creating a temporary directory and saving the directory name (with the trailing slash) for later use to create pathnames for files to be created in that directory. These applications save a step by including the trailing slash when the temporary directory is created and did not believe they were breaking the intended requirements of any standard.

rename():
    We all believe that rename("dir", "new") and rename("dir/", "new") should be equivalent. We all believe that rename("dir", "new") should not be treated the same as rename("dir", "new/") although there were dissenting opinions at the start of the discussion. We believe that in this case, "new/" should fail if "new" does not name an existing directory. The rename() function is very similar to the mv utility when given two operands that are on the same file system. For the mv utility, the command:

        mv dir dir2/

    should always rename dir to be dir2/dir or fail (matching historic BSD mv behavior). To match this, rename("dir", "dir2") and rename("dir", "dir2/") should rmdir("dir2") and rename("dir", "dir2") as an atomic operation, but rename("dir", "new/") should fail (since "new" is not an existing directory).

rmdir():
    We all believe that rmdir("dir") and rmdir("dir/") should be equivalent.

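The consensus summarized above can be sketched as a small decision function. This is only a model of the trailing-slash rule under discussion, not an implementation of pathname resolution: the enumerations and the name trailing_slash_ok() are hypothetical, the caller is assumed to already know what the last component names, and all other error conditions (EEXIST, EACCES, and so on) are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the last pathname component (before any trailing slashes)
 * names. Hypothetical classification for illustration only. */
enum last_component { LC_MISSING, LC_DIRECTORY, LC_NON_DIRECTORY };

/* How the calling interface uses the pathname (also hypothetical). */
enum path_use {
    USE_EXISTING,     /* e.g. rmdir(), unlink(), the old argument of rename() */
    USE_CREATE_DIR,   /* e.g. mkdir() */
    USE_CREATE_OTHER, /* e.g. mkfifo() */
    USE_RENAME_NEW    /* the last argument of rename() */
};

/* Returns 1 if path has a non-slash character and ends with a slash. */
static int has_trailing_slash(const char *path)
{
    size_t len = strlen(path);
    return len > 0 && path[len - 1] == '/' && strspn(path, "/") < len;
}

/* Returns 1 if the trailing-slash rule lets resolution proceed, and 0
 * if the rule by itself requires the call to fail. */
int trailing_slash_ok(const char *path, enum last_component lc,
                      enum path_use use)
{
    assert(path != NULL);
    if (!has_trailing_slash(path))
        return 1;               /* the rule does not apply */
    if (lc == LC_DIRECTORY)
        return 1;               /* names an existing directory */
    /* A directory that is about to be created is acceptable, but not
     * as the last argument of rename(): rename("dir", "new/") fails. */
    return lc == LC_MISSING && use == USE_CREATE_DIR;
}
```

For example, trailing_slash_ok("new/", LC_MISSING, USE_CREATE_DIR) models mkdir("new/") and yields 1, while trailing_slash_ok("new/", LC_MISSING, USE_RENAME_NEW) models rename("dir", "new/") and yields 0.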
If the Action suggested below is accepted, these changes would allow applications to perform the following operations successfully (assuming non-existing does not name any existing file and assuming existing-dir and another-existing-dir resolve to existing directories upon which the process has permission to operate):

    mkdir("non-existing/", ...)
    mkdir("non-existing", ...)
    rmdir("existing-dir/")
    rmdir("existing-dir")
    rename("existing-dir/", "another-existing-dir/")
    rename("existing-dir", "another-existing-dir/")
    rename("existing-dir/", "another-existing-dir")
    rename("existing-dir", "another-existing-dir")

And all of the following would fail (assuming non-dir names an existing file that is not a directory):

    open("non-dir/", ..., ...)
    mkdir("non-dir/")
    unlink("non-dir/")
    rename("non-dir/", ...)
    rename(..., "non-dir/")
    rename(..., "non-existing/")

Action:

I propose the following changes in the next revision of the standard to resolve this issue:

Change the first paragraph of the description of the Pathname Resolution general concept on P99, L2954-2955 from:

    ``Pathname resolution is performed for a process to resolve a pathname to a particular file in a file hierarchy. There may be multiple pathnames that resolve to the same file.''

to:

    ``Pathname resolution is performed for a process to resolve a pathname to a particular directory entry for a file in the file hierarchy. There may be multiple pathnames that resolve to the same file. When a process resolves a pathname of an existing file, the entire pathname shall be resolved as described below. When a process resolves a pathname of a file that is to be created immediately after the pathname is resolved, pathname resolution terminates when all components of the path prefix of the last component have been resolved.
    It is then the responsibility of the process to create the final component.''

Change the fourth paragraph of the description of the Pathname Resolution general concept on P99, L2967-2969 from:

    ``A pathname that contains at least one non-slash character and that ends with one or more trailing slashes shall be resolved as if a single dot character ('.') were appended to the pathname.''

to:

    ``A pathname that contains at least one non-slash character and that ends with one or more trailing slashes shall not be resolved successfully unless the last pathname component before the trailing slashes names an existing directory or a directory entry that is to be created for a directory immediately after the pathname is resolved. Interfaces using pathname resolution may specify additional constraints* when a pathname that does not name an existing directory contains at least one non-slash character and contains one or more trailing slashes.''
    __________________
    ``* The only interfaces that further constrain pathnames in this standard are the rename() and renameat() functions (see XSH rename(), on page xxx) and the mv utility (see XCU mv, on page xxx).''

Add the new sentence:

    ``In this case, if target_file ends with a trailing slash character, mv shall treat this as an error and no source_file operands will be processed.''

to the end of the paragraph on XCU P2876, L95510-95513 (1st paragraph of the description of mv).

Add the new sentence (CX shaded):

    ``If the new argument does not resolve to an existing file of type directory and the new argument contains at least one non-slash character and ends with one or more trailing slashes after all symbolic links have been processed, rename() shall fail.''

to the end of the paragraph on XSH, P1740, L55827-55828 (2nd paragraph of the description of rename() and renameat()).

Change XSH, P1741, L55902-55903 (shall fail ENOTDIR error for rename() and renameat()) from:

    ``[ENOTDIR] A component of either path prefix is not a directory; or the old argument names a directory and the new argument names a non-directory file.''

to:

    ``[ENOTDIR] A component of either path prefix is not a directory; the old argument names a directory and the new argument names a non-directory file; or the new argument names a non-existent file and ends with a trailing slash ('/') character.''

all CX shaded.

Change XRAT, P3350, L113791-113792 from (rationale for the Pathname Resolution general concept):

    ``POSIX.1-200x requires that a pathname with a trailing slash character be treated as if it had a trailing "/." everywhere.''

to:

    ``An earlier version of this standard required that a pathname with a trailing slash character be treated as if it had a trailing "/." everywhere. This specification was ambiguous. In situations where the intent was that the application wanted to require the implementation to accept the pathname only if it named a directory (existing or to be created as a result of the call performing pathname resolution), literally adding a "." after the trailing slash could be interpreted to require use of that pathname to fail. Some of the uses that created ambiguous requirements included mkdir("newdir/") and rmdir("existing-dir/"). This standard requires that a pathname with a trailing slash be rejected unless it refers to a file that is a directory or to a file that is to be created as a directory. The rename() function and the mv utility further specify that a trailing slash cannot be used on a pathname naming a file that does not exist when used as the last argument to rename() or renameat(), or as the last operand to mv.''

Cheers,
Don