Austin Group Minutes of the 4 November Teleconference          Austin-226

Submitted by Andrew Josey, The Open Group.
November 5, 2004

Attendees
  Andrew Josey, The Open Group
  Nick Stoughton, USENIX, ISO/IEC OR
  Don Cragun, Sun, PASC OR
  Bruce Korb

Apologies
  Ulrich Drepper, Red Hat
  Mark Brown, IBM, TOG OR

Status of next plenary meeting
------------------------------
Andrew reported that it is possible that the PASC SEC will also meet during the same week. This is to be confirmed.

On agenda items, we expect to spend at least one whole day on the topic of utility syntax guidelines / option ordering. Andrew will update the online agenda with the latest thoughts.

Other topics
------------
Nick reported that the C committee expect to produce a technical report on some new string functions which have bounds checking and will be looking for feedback. Andrew mentioned that the Base WG has a proposal in a strawman draft to address this another way.

Defect Report Processing
------------------------
The group picked up on the latest batch of defect reports, which are available at the following URL:

http://www.opengroup.org/austin/aardvark/latest/

XBD ERN 23 signal.h SIGPOLL Accept as marked below

This is an Interpretation:

The standard states the requirements for SIGPOLL to be supported as part of the XSI option, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
The semantics of SIGPOLL are only specified with functionality in the XSR option.

Notes to the Editor for a future revision (not part of this interpretation):

In XBD, signal.h
In the DESCRIPTION
Change from:
"[XSI] SIGPOLL T Pollable event."
To:
"[XSR] SIGPOLL T Pollable event."
and start XSI shading on the line below.

In the definition of siginfo_t XSR mark and shade the following:
"long si_band Band event for SIGPOLL."

The following should be XSR marked and shaded in the table headed Signals/Codes:
"
SIGPOLL   POLL_IN    Data input available.
          POLL_OUT   Output buffers available.
          POLL_MSG   Input message available.
          POLL_ERR   I/O error.
          POLL_PRI   High priority input available.
          POLL_HUP   Device disconnected.
"

And in the table headed Signal/Member/Value XSR mark and shade:
"SIGPOLL long si_band Band event for POLL_IN, POLL_OUT, or POLL_MSG."

XBD ERN 24 POSIX advisory information Accept as marked below

No change required

This is presently covered in B.2.8 Realtime, Advisory Information

Advisory Information

POSIX.1b contains an Informative Annex with proposed interfaces for "realtime files". These interfaces could determine groups of the exact parameters required to do "direct I/O" or "extents". These interfaces were objected to by a significant portion of the balloting group as too complex. A conforming application had little chance of correctly navigating the large parameter space to match its desires to the system. In addition, they only applied to a new type of file (realtime files) and they told the implementation exactly what to do as opposed to advising the implementation on application behavior and letting it optimize for the system the (portable) application was running on. For example, it was not clear how a system that had a disk array should set its parameters.

There seemed to be several overall goals:

* Optimizing sequential access
* Optimizing caching behavior
* Optimizing I/O data transfer
* Preallocation

The advisory interfaces, posix_fadvise() and posix_madvise(), satisfy the first two goals. The POSIX_FADV_SEQUENTIAL and POSIX_MADV_SEQUENTIAL advice tells the implementation to expect serial access. Typically the system will prefetch the next several serial accesses in order to overlap I/O. It may also free previously accessed serial data if memory is tight.
If the application is not doing serial access it can use POSIX_FADV_WILLNEED and POSIX_MADV_WILLNEED to accomplish I/O overlap, as required. When the application advises POSIX_FADV_RANDOM or POSIX_MADV_RANDOM behavior, the implementation usually tries to fetch a minimum amount of data with each request and it does not expect much locality. POSIX_FADV_DONTNEED and POSIX_MADV_DONTNEED allow the system to free up caching resources as the data will not be required in the near future.

POSIX_FADV_NOREUSE tells the system that caching the specified data is not optimal. For file I/O, the transfer should go directly to the user buffer instead of being cached internally by the implementation. To portably perform direct disk I/O on all systems, the application must perform its I/O transfers according to the following rules:

1. The user buffer should be aligned according to the {POSIX_REC_XFER_ALIGN} pathconf() variable.
2. The number of bytes transferred in an I/O operation should be a multiple of the {POSIX_ALLOC_SIZE_MIN} pathconf() variable.
3. The offset into the file at the start of an I/O operation should be a multiple of the {POSIX_ALLOC_SIZE_MIN} pathconf() variable.
4. The application should ensure that all threads which open a given file specify POSIX_FADV_NOREUSE to be sure that there is no unexpected interaction between threads using buffered I/O and threads using direct I/O to the same file.

In some cases, a user buffer must be properly aligned in order to be transferred directly to/from the device. The {POSIX_REC_XFER_ALIGN} pathconf() variable tells the application the proper alignment.

The preallocation goal is met by the space control function, posix_fallocate(). The application can use posix_fallocate() to guarantee no [ENOSPC] errors and to improve performance by prepaying any overhead required for block allocation.

Implementations may use information conveyed by a previous posix_fadvise() call to influence the manner in which allocation is performed.
For example, if an application did the following calls:

    fd = open("file");
    posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);
    posix_fallocate(fd, len, size);

an implementation might allocate the file contiguously on disk.

Finally, the pathconf() variables {POSIX_REC_MIN_XFER_SIZE}, {POSIX_REC_MAX_XFER_SIZE}, and {POSIX_REC_INCR_XFER_SIZE} tell the application a range of transfer sizes that are recommended for best I/O performance.

Where bounded response time is required, the vendor can supply the appropriate settings of the advisories to achieve a guaranteed performance level.

The interfaces meet the goals while allowing applications using regular files to take advantage of performance optimizations. The interfaces tell the implementation expected application behavior which the implementation can use to optimize performance on a particular system with a particular dynamic load.

The posix_memalign() function was added to allow for the allocation of specifically aligned buffers; for example, for {POSIX_REC_XFER_ALIGN}.

The working group also considered the alternative of adding a function which would return an aligned pointer to memory within a user-supplied buffer. This was not considered to be the best method, because it potentially wastes large amounts of memory when buffers need to be aligned on large alignment boundaries.

XCU ERN 24 cd relative paths Accept as marked below

This is an interpretation.

The standard states the requirements for the cd utility and its handling of symbolic links, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
A number of defects have been identified with how the cd utility handles symbolic links.

Notes to the Editor (not part of the interpretation):

Replace step 6 with the following:
"6. If the -P option is in effect, set curpath to the directory operand.
Otherwise, set curpath to the string formed by the concatenation of the value of PWD, a slash character, and the operand."

Replace step 7 with the following:
"7. If the -P option is in effect, proceed to step 10. If curpath does not begin with a slash character, set curpath to the string formed by the concatenation of the value of PWD, a slash character, and the operand."
(Note that most of the old step 7 text reappears in the new step 10 below.)

Replace step 8b with the following:
"b. For each dot-dot component, if there is a preceding component and it is neither root nor dot-dot, then:
   i. If the preceding component does not refer (in the context of pathname resolution with symbolic links followed) to a directory, then the cd utility shall display an appropriate error message and no further steps shall be taken.
   ii. The preceding component, all slashes separating the preceding component from dot-dot, dot-dot and all slashes separating dot-dot from the following component (if any) shall be deleted."

Insert a new step 9:
"9. If curpath is longer than {PATH_MAX} bytes (including the terminating null) and the directory operand was not longer than {PATH_MAX} bytes (including the terminating null), then curpath shall be converted from an absolute pathname to an equivalent relative pathname if possible. This conversion shall always be considered possible if the value of PWD, with a trailing slash added if it does not already have one, is an initial substring of curpath. Whether or not it is considered possible under other circumstances is unspecified. Implementations may also apply this conversion if curpath is not longer than {PATH_MAX} bytes or the directory operand was longer than {PATH_MAX} bytes."

Replace the old step 9 with the following:
"10. The cd utility shall then perform actions equivalent to the chdir() function called with curpath as the path argument.
If these actions fail for any reason, the cd utility shall display an appropriate error message and the remainder of this step shall not be executed. If the -P option is not in effect, the PWD environment variable shall be set to the value that curpath had on entry to step 9 (i.e. before conversion to a relative pathname). If the -P option is in effect, the PWD environment variable shall be set to an absolute pathname for the current working directory and shall not contain filename components that, in the context of pathname resolution, refer to a file of type symbolic link. If there is insufficient permission on the new directory, or on any parent of that directory, to determine the current working directory, the value of the PWD environment variable is unspecified."

XCU ERN 36 what constitutes a number for test and sh arithmetic expansion Accept as marked below

This is an interpretation.

The standard does not speak to this issue of what constitutes a number in XCU's test(1) and shell arithmetic expansion, and as such no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

In the event that the primary operand to the primary operators (-gt, -ge, -lt, -le, -eq, -ne) are not integers, implementations are free to provide extensions that would recognize those values or to treat them as errors.

The standard is unclear whether the integer arguments to the six binary primaries are only decimal or if octal or hexadecimal are recognized. Historically only decimal values have been recognized.

Notes to the editor for a future revision (not part of this interpretation):

In XCU test OPERANDS section on p909
Change "integers/integer" on lines 35256-35262 to "decimal integers/integer"

XCU ERN 46 uucp removal from specification Accept as marked below

This is not a defect in the current standard which reflects existing known practice.
It is agreed that this item should be placed into SD/5 for consideration of whether to move it into an option in the next revision. The review group noted that based on feedback to date there appears to still be use and there are freely available implementations.

XCU ERN 47 c99 -l operand Accept as marked

This is an Interpretation

The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Notes to the Editor for a future revision (not part of this interpretation):

On line 8342 change:
"An operand is either in the form of a pathname or the form -l library."
to:
"An operand is either in the form of a pathname or the form -llibrary, or is one of two consecutive operands of the form -l for the first and library for the second."

On line 8354 change:
"-l library (The letter ell.) Search the library named:"
to:
"-llibrary (A <hyphen>, the letter ell and a library name.)
-l library (Two consecutive operands, the first being a <hyphen> and the letter ell; the second being a library name.)
Search the library named:"

Add a new para before 8356 p213:
For the remainder of this description of the c99 utility, both of the forms -l library and -llibrary are referred to as -l operand for brevity (even though the -l library form is actually two operands).

After line 8359 add a new paragraph:
"If the last operand is a -l with no library name, then the c99 utility shall write a diagnostic message to standard error and shall return a non-zero exit status."

XSH ERN 62 dbm_open Accept as marked below.

This is an Interpretation:

The standard states the requirements for the dbm_* functions and their database implementation, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.
Rationale:
The current standard describes a specific implementation for storage of a database excluding common existing practice which has evolved yet remained compatible at the application programming interface.

Notes to the Editor for a future revision (not part of this interpretation):

In the DESCRIPTION
Change from:
"A datum consists of at least two members, dptr and dsize. The dptr member points to an object that is dsize bytes in length. Arbitrary binary data, as well as character strings, may be stored in the object pointed to by dptr.

The database is stored in two files. One file is a directory containing a bitmap of keys and has .dir as its suffix. The second file contains all data and has .pag as its suffix.

The dbm_open() function shall open a database. The file argument to the function is the pathname of the database. The function opens two files named file.dir and file.pag. The open_flags argument has the same meaning as the flags argument of open() except that a database opened for write-only access opens the files for read and write access and the behavior of the O_APPEND flag is unspecified. The file_mode argument has the same meaning as the third argument of open()."

To:
"A datum consists of at least two members, dptr and dsize. The dptr member points to an object that is dsize bytes in length. Arbitrary binary data, as well as character strings, may be stored in the object pointed to by dptr.

A database shall be stored in one or two files. When one file is used, the name of the database file shall be formed by appending the suffix ".db" to the file argument given to dbm_open(). When two files are used, the names of the database files shall be formed by appending the suffixes ".dir" and ".pag" respectively to the file argument.

The dbm_open() function shall open a database. The file argument to the function is the pathname of the database.
The open_flags argument has the same meaning as the flags argument of open() except that a database opened for write-only access opens the files for read and write access and the behavior of the O_APPEND flag is unspecified. The file_mode argument has the same meaning as the third argument of open(). The dbm_open() function need not accept pathnames longer than {PATH_MAX}-4 bytes (including the terminating null), or pathnames with a last component longer than {NAME_MAX}-4 bytes (excluding the terminating null)."

Add to APPLICATION USAGE
Applications should take care that database pathname arguments specified to dbm_open() are not prefixes of unrelated files. This might be done, for example, by placing databases in a separate directory.

Since some implementations use three characters for a suffix and others use four characters for a suffix, applications should ensure that the maximum portable pathname length passed to dbm_open() is no greater than {PATH_MAX}-4 bytes, with the last component of the pathname no greater than {NAME_MAX}-4 bytes.

Add to RATIONALE:
Previously the standard required the database to be stored in two files, one file being a directory containing a bitmap of keys and having ".dir" as its suffix, the second file containing all data and having ".pag" as its suffix. This has been changed not to specify the use of the files and to allow newer implementations of the Berkeley DB interface using a single file that have evolved while remaining compatible with the application programming interface. The standard developers considered removing the specific suffixes altogether but decided to retain them so as not to pollute the application file namespace more than necessary and to allow for portable backups of the database.

Next Steps
----------
Andrew will update the aardvark reports with the latest inbound defect reports.

There are a number of open action items outstanding:

1. Don Cragun       Pathname Resolution proposal
2. Larry Dwyer      system() and threads
3. Joerg Schilling  wording for XCU ERN 1 pax

The next teleconference call is scheduled for Nov 11 2004
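
Illustration for XSH ERN 62 (dbm_open pathname limits)

The pathname limits recorded under XSH ERN 62 above can be sketched in C as a check an application might apply before calling dbm_open(). This is a minimal sketch, not text from the standard: the helper name dbm_name_fits and its parameters path_max and name_max are assumptions made for this example, and the caller is assumed to obtain the two limits itself, for instance from pathconf() or <limits.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any standard interface): reports whether
 * `file` stays within the limits proposed for dbm_open() under XSH ERN 62.
 * path_max and name_max are supplied by the caller and stand in for
 * {PATH_MAX} and {NAME_MAX}.
 */
bool dbm_name_fits(const char *file, size_t path_max, size_t name_max)
{
    assert(file != NULL);

    /* Avoid unsigned wrap-around when subtracting the suffix allowance. */
    if (path_max < 4 || name_max < 4)
        return false;

    /* Whole pathname, counting the terminating null: at most path_max - 4. */
    size_t len = strlen(file);
    if (len + 1 > path_max - 4)
        return false;

    /* Last component, not counting the terminating null: at most name_max - 4. */
    const char *slash = strrchr(file, '/');
    const char *last = (slash != NULL) ? slash + 1 : file;
    return strlen(last) <= name_max - 4;
}
```

With limits of 32 and 14, for example, "db/users" passes, while "db/averylongname" does not because its last component is 13 bytes against a limit of 10.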