Minutes of the 20th October 2022 Teleconference     Austin-1264 Page 1 of 1
Submitted by Andrew Josey, The Open Group.          21st October 2022

Attendees:
    Don Cragun, IEEE PASC OR
    Eric Blake, Red Hat, The Open Group OR
    Eric Ackermann, HPI, University of Potsdam
    Andrew Josey, The Open Group
    Geoff Clare, The Open Group

Apologies:
    Nick Stoughton
    Mark Ziegast
    Tom Thompson

* General news

This was a call dedicated to general bugs.

* Carried Forward

Bug 1560: clarify wording of command substitution
https://austingroupbugs.net/view.php?id=1560

Leave this and related bugs 1561 and 1564 open awaiting reviews/discussion.

Bug 1273: glob()'s GLOB_ERR/errfunc and non-directory files OPEN
https://austingroupbugs.net/view.php?id=1273

We will leave this open awaiting feedback.

Bug 768: add "fd-private" POSIX locks to spec OPEN
https://austingroupbugs.net/view.php?id=768

Linux has now had OFD locks for several years, and more code in the wild is starting to use it - so we have existing practice (see note 2508)
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Open-File-Description-Locks

Starting point of resolution at https://posix.rhansen.org/p/bug768

AI to EricB - take the GNU documentation and turn it into a desired action

Bug 739: CX requirements for strftime seem to conflict with ISO C OPEN
https://austingroupbugs.net/view.php?id=739

Nick completed his action to liaise with the C committee on this issue. C23 is about to go to ballot, so the way to raise the issue would be as a ballot comment. This could be done either as a UK or US national body comment. Andrew confirmed with the UK C Panel that we can submit comments.

Bug 728: Restrictions on signal handlers are both excessive and insufficient OPEN
https://austingroupbugs.net/view.php?id=728

The alignment with C17 changes the requirements in this area. There is still no allowance for accessing const objects or string literals, so that is something we could consider adding as an extension to C.
Or we could raise the issue with the C committee if we want to stay in sync with the C standard. Suggestion: raise it as a C23 ballot comment. Depending on the answer, we could implement what they plan to do, or diverge (or leave things as they are). AI Nick and Geoff: File ballot comments on C23 (through the US and/or UK national bodies to WG14). Bug 708: Make mblen, mbtowc, and wctomb thread-safe for alignment with C11 OPEN https://austingroupbugs.net/view.php?id=708 This was discussed with WG14: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2148.htm#dr_498 DR 498 (URL above) seems to have an agreed wording change from April 2017, but it has not been applied. Nick asked the WG14 convenor what happened to that DR. It appears the item was discussed but in the end the C committee could not agree an acceptable change, and no alternate proposal was available. We will need to accept the C wording for now. Bug 700: Clarify strtoul's behaviour on strings representing negative numbers OPEN https://austingroupbugs.net/view.php?id=700 AI: Nick and Geoff to work together on a C23 ballot comment (against N3047 7.24.1.7 para 5). Bug 689: Possibly unintended allowance for stdio deadlock OPEN https://austingroupbugs.net/view.php?id=689 Change needs to be coordinated with C23. AI: Nick and Geoff to work together on a C23 ballot comment (against N3047 7.23.3 para 3). Bug 618: require isatty and friends to set errno on failure OPEN https://austingroupbugs.net/view.php?id=618 Action item to EricB: email to various lists to collect results for sample program (see bug 503 comment 1005 for starting point) for current errno behavior on various OS Bug 375: Extend test/[...] conditionals: ==, <, >, -nt, -ot, -ef OPEN https://austingroupbugs.net/view.php?id=375 This bug was reviewed, and the following added as note. Given the difficulties identified with the matching operations performed by == and =~ in [[ ... ]] there would seem to be two options for progressing this bug: 1. 
Keep [[ ... ]] but without =~ and with == "neutered" such that it can only be used portably for fixed-string comparisons.

2. Omit [[ ... ]] altogether but perhaps add -nt, -ot, and -ef to test/[ instead.

We would welcome feedback on whether option 1 is an acceptable compromise or is too limiting. We need to resolve this bug by no later than 2022-11-13 if it is to make it into draft 3. After that, only option 2 will be possible (because draft 3 will be "feature complete").

* Current Business

Bug 249: Add standard support for $'...' in shell Accepted as Marked
https://austingroupbugs.net/view.php?id=249

This item is tagged for Issue 8.

Page and line numbers are for the 2013 edition (C138.pdf)

At page 2319 line 73573 (XCU section 2.1, Shell Introduction, item 4) change:

The shell performs various expansions (separately) on different parts of each command, resulting in a list of pathnames and fields to be treated as a command and arguments; see [xref to 2.6].

to:

For each word within a command, the shell processes backslash escape sequences inside dollar-single-quotes (see [xref to 2.2.4]) and then performs various word expansions (see [xref to 2.6]). In the case of a simple command, the results usually include a list of pathnames and fields to be treated as a command name and arguments; see [xref to 2.9].

At page 2320 line 73594 (XCU section 2.2, Quoting) change:

The various quoting mechanisms are the escape character, single-quotes, and double-quotes.

to:

The various quoting mechanisms are the escape character, single-quotes, double-quotes, and dollar-single-quotes.

At page 2320 lines 73609-73611 (XCU 2.2.3, Double-Quotes), change:

$ The <dollar-sign> shall retain its special meaning introducing parameter expansion (see Section 2.6.2), a form of command substitution (see Section 2.6.3), and arithmetic expansion (see Section 2.6.4).
to:

$ The <dollar-sign> shall retain its special meaning introducing parameter expansion (see Section 2.6.2), a form of command substitution (see Section 2.6.3), and arithmetic expansion (see Section 2.6.4), but shall not retain its special meaning introducing the dollar-single-quotes form of quoting (see [xref to 2.2.4]).

At page 2321 lines 73626-73627 (XCU 2.2.3, Double-Quotes), change:

A single-quoted or double-quoted string that begins, but does not end, within the "`...`" sequence

to:

A quoted (single-quoted, double-quoted, or dollar-single-quoted) string that begins, but does not end, within the "`...`" sequence

After page 2321 line 73635 (end of XCU section 2.2), insert a new subsection:

2.2.4 Dollar-Single-Quotes

A sequence of characters starting with a <dollar-sign> immediately followed by a single-quote ($') shall preserve the literal value of all characters up to an unescaped terminating single-quote ('), with the exception of certain backslash escape sequences, as follows:

    \" yields a <quotation-mark> (double-quote) character, but note that <quotation-mark> can be included unescaped.

    \' yields an <apostrophe> (single-quote) character.

    \\ yields a <backslash> character.

    \a yields an <alert> character.

    \b yields a <backspace> character.

    \e yields an <ESC> character.

    \f yields a <form-feed> character.

    \n yields a <newline> character.

    \r yields a <carriage-return> character.

    \t yields a <tab> character.

    \v yields a <vertical-tab> character.

    \cX yields the control character listed in the Value column of [xref to XCU Table 4.21] in the Operands section of the stty utility when X is one of the characters listed in the ^c column of the same table, except that \c\\ yields the <FS> control character since the <backslash> character must be escaped.

    \xXX yields the byte whose value is the hexadecimal value XX (one or more hex digits). If more than two hex digits follow \x, the results are unspecified.

    \ddd yields the byte whose value is the octal value ddd (one to three octal digits).

The behavior of a <backslash> immediately followed by any other character, including <newline>, is unspecified.

In cases where a variable number of characters can be used to specify an escape sequence (\xXX and \ddd), the escape sequence shall be terminated by the first character that is not of the expected type or, for \ddd sequences, when the maximum number of characters specified has been found, whichever occurs first.

These backslash escape sequences shall be processed (replaced with the bytes or characters they yield) immediately prior to word expansion (see [xref to 2.6]) of the word in which the dollar-single-quotes sequence occurs.

If a \xXX or \ddd escape sequence yields a byte whose value is 0, it is unspecified whether that null byte is included in the result or if that byte and any following regular characters and escape sequences up to the terminating unescaped single-quote are evaluated and discarded.

If the octal value specified by \ddd will not fit in a byte, the results are unspecified.

If a \e or \cX escape sequence specifies a character that does not have an encoding in the locale in effect when these backslash escape sequences are processed, the result is implementation-defined. However, implementations shall not replace an unsupported character with bytes that do not form valid characters in that locale's character set.

If a backslash escape sequence represents a single-quote character (for example \'), that sequence shall not terminate the dollar-single-quote sequence.

At page 2321 lines 73658-73664 (XCU section 2.3 (Token Recognition) point 4), change:

4. If the current character is <backslash>, single-quote, or double-quote and it is not quoted, it shall affect quoting for subsequent characters up to the end of the quoted text. The rules for quoting are as described in Section 2.2 (on page 2298).
During token recognition no substitutions shall be actually performed, and the result token shall contain exactly the characters that appear in the input (except for <newline> joining), unmodified, including any embedded or enclosing quotes or substitution operators, between the <quotation-mark> and the end of the quoted text. The token shall not be delimited by the end of the quoted field.

to:

4. If the current character is an unquoted <backslash>, single-quote, or double-quote or is the first character of an unquoted <dollar-sign>single-quote sequence, it shall affect quoting for subsequent characters up to the end of the quoted text. The rules for quoting are as described in [xref to Section 2.2]. During token recognition no substitutions shall be actually performed, and the result token shall contain exactly the characters that appear in the input unmodified, including any embedded or enclosing quotes or substitution operators, between the start and the end of the quoted text. The token shall not be delimited by the end of the quoted field.

After page 2327 line 73900 (XCU section 2.6, Word Expansions), insert a new bullet point:

a

At page 2331 lines 74071-74073 (XCU 2.6.3, Command Substitution), change:

A single-quoted or double-quoted string that begins, but does not end, within the "`...`" sequence produces undefined results.

to:

A quoted string that begins, but does not end, within the "`...`" sequence produces undefined results.

At page 2333 lines 74157-74158 (XCU section 2.6.7, Quote Removal), change:

The quote characters (<backslash>, single-quote, and double-quote) that were present in the original word shall be removed unless they have themselves been quoted.

to:

The quote character sequence <dollar-sign>single-quote and the single-character quote characters (<backslash>, single-quote, and double-quote) that were present in the original word shall be removed unless they have themselves been quoted. Note that the single-quote character that terminates a <dollar-sign>single-quote sequence is itself a single-character quote character.

Note that after quote removal the shell still remembers which characters were quoted. This is necessary for purposes such as matching patterns in a case conditional construct (see [xref to 2.9.4.3] and [xref to 2.13]).

At page 2348 lines 74718-74719 (the Note in XCU section 2.10.2 (Shell Grammar Rules) rule 1), change:

Because at this point <quotation-mark> characters are retained in the token, quoted strings cannot be recognized as reserved words.

to:

Because at this point quoting characters (<backslash>, single-quote, <quotation-mark>, and the <dollar-sign>single-quote sequence) are retained in the token, quoted strings cannot be recognized as reserved words.

After page 3677 line 125685 (end of XRAT C.2.2.3), insert a new paragraph:

The $'...' construct does not retain its special meaning inside double quotes. This was discussed by the standard developers and rejected. Note that $'...' is a quoting mechanism and not an expansion. Losing the special meaning inside double quotes is consistent with other quoting mechanisms losing their special meaning when quoted.

After the above insertion and before page 3678 line 125686 (XRAT C.2.3), insert a new subsection:

C.2.2.4 Dollar-Single-Quotes

The $'...' quoting construct has been implemented in several recent shells. It is similar to character string literals ("...") in the ISO C standard with the following exceptions:

    The \x escape sequence in C can be followed by an arbitrary number of hexadecimal digits. The ksh93 implementation of $'...' also consumes an arbitrary number of hexadecimal digits; bash consumes at most two hexadecimal digits in this case. This standard leaves the result unspecified if more than two hexadecimal digits follow \x. (Note that a hexadecimal escape followed by a literal hexadecimal character can always be represented as $'\xXX'X.)

    The \c escape sequence is not included in the ISO C standard. There was also some disagreement in shells that historically supported \c escape sequences in $'...'.
    These include:

        whether \cA through \cZ produced the byte values 1 through 26, respectively, or supported the codeset independent control character as specified by the stty utility. This standard requires codeset independence.

        whether \c[, \c\\, \c], \c^, \c_, and \c? could be used to yield the <ESC>, <FS>, <GS>, <RS>, <US>, and <DEL> control characters, respectively. This standard requires support for all of the control characters except NULL (matching what is done in the stty utility).

        whether \c\\ or \c\ was used to represent <FS>. This standard requires \c\\ to make backslash escape processing consistent.

    The implementors of the most common shells that implement $'\cX' agreed to convert to the behavior specified in this standard.

    Some shells also allow \c<control character> to act as an inverse function to \cX (i.e., \cm and \cM yield <CR> and \c<CR> yields m or M). This standard leaves this behavior implementation-defined.

    The \e escape sequence is not included in the ISO C standard, but was provided by all historical shells that supported $'...'. Some also supported \E as a synonym. One member of the group objected to adding \e because the <ESC> control character is not required to be in the portable character set. The \e sequence is included because many historical users of $'...' expect it to be there. The \E sequence is not included in this standard because escape sequences that start with <backslash> followed by an uppercase letter (except \U) are reserved by the C Standard for implementation use.

    The \ddd octal escape sequence and the \xXX hexadecimal escape sequence can be used to insert a null byte into a C Standard character string literal and into a $'...' quoted word in this standard. In C, any characters specified after that null byte (including escape sequences) continue to be processed and added to the character string literal. In $'...' in the shell this standard allows the equivalent behavior but also allows the null byte and all remaining characters up to the terminating unescaped single-quote to be evaluated and discarded.
    The latter (which was historic practice in bash, but not in ksh93) allows an escape sequence producing a null byte to terminate the dollar-single-quoted expansion, but not terminate the token in which it appears if there are characters remaining in the token. For example:

        printf a$'b\0c\''d

    is required by this standard to produce:

        abd

    while historic versions of ksh93 produced:

        ab

    The ISO C standard specifies \uXXXX and \UXXXXXXXX escape sequences. These need not be supported by $'...' in the shell. They were omitted because current shell implementations that support them differ in behavior. In particular, some shells always convert them to the UTF-8 encoding for the named character, even if the current locale's character set does not have UTF-8 encoding.

    The double-quote (") character can be used literally, while the single-quote (') character must be represented as an escape sequence. In C, single-quote can be used literally, while double-quote requires an escape sequence.

    A <backslash> immediately followed by a <newline> has unspecified behavior. In C, this sequence is used for line continuations, where both the <backslash> and <newline> are deleted and a diagnostic is required if a closing quote is not encountered before a <newline> that is not preceded by <backslash>. In current shell implementations, three different behaviors have been observed.

    Backslash escape sequences not described in the standard result in unspecified behavior. In C, the result is not a token and a diagnostic is required. This allows shells to recognize other backslash escape sequences in other ways as extensions to the standard's requirements. Furthermore, existing implementations already had different behaviors for some backslash escape sequences when $'...' processing was added to the standard.

This standard makes the results implementation-defined if \e or \cX specifies a character that is not present in the current locale.
Application authors should note that implementations are permitted to have a wide range of behaviors when encountering an unsupported character. For example:

    the shell might produce an error, possibly causing the shell to terminate

    the unsupported character might be silently discarded

    the unsupported character might be replaced with another character of a different character class

    the unsupported character might be replaced with a shell-special character (e.g., '?')

    the unsupported character might be replaced with multiple characters, shell-special or regular (e.g. if <ESC> is not supported $'\e' may be replaced by "???", "XXX" or "")

However, implementations must document their behavior, and they are prohibited from replacing an unsupported character with bytes that do not form valid characters in the current locale's character set (e.g., encoding in UTF-8 when the locale has a 7-bit character set).

This standard does not specify a way for script authors to determine beforehand whether a particular \cX sequence specifies a character that exists in the current locale. At the time this feature was standardized, no known implementations provided such a capability.

Note that the escape sequences recognized by $'...', file format notation (see [xref to Table 5-1]), XSI-conforming implementations of the echo utility (see the utility's operands section on [xref to echo]), and the printf utility's format operand (see the utility's extended description on [xref to printf]) are not the same. Some escape sequences are not recognized by all of the above; the \c escape sequence in echo is not at all like the \c escape sequence in $'...'; and octal escape sequences in some of the above accept one to four octal digits and require a leading zero while others accept one to three octal digits and do not require a leading zero.
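
The escape rules in the new section 2.2.4 can be sketched as a small decoder. This is only an illustration under stated assumptions, not text from the resolution and not how any particular shell implements it. The function name decode_dollar_single_quote and the table names SIMPLE and CONTROL are hypothetical. The sketch assumes an ASCII-based codeset for the control-character values (the standard itself requires codeset independence), encodes literal characters as UTF-8, keeps a null byte in the result (one of the two permitted behaviors), and raises ValueError wherever the wording above leaves the result unspecified.

```python
import string

# Single-character escapes from section 2.2.4 (ASCII values assumed).
SIMPLE = {
    '"': 0x22, "'": 0x27, "\\": 0x5C,
    "a": 0x07, "b": 0x08, "e": 0x1B, "f": 0x0C,
    "n": 0x0A, "r": 0x0D, "t": 0x09, "v": 0x0B,
}

# \cX operands other than backslash: letters in either case map to 1..26,
# then the remaining stty control characters. NUL is deliberately absent.
CONTROL = {c: ord(c.upper()) - 0x40 for c in string.ascii_letters}
CONTROL.update({"[": 0x1B, "]": 0x1D, "^": 0x1E, "_": 0x1F, "?": 0x7F})


def decode_dollar_single_quote(body: str) -> bytes:
    """Decode the text between $' and the terminating unescaped '."""
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        i += 1
        if ch != "\\":
            out += ch.encode("utf-8")  # assumption: literal text is UTF-8
            continue
        if i >= n:
            raise ValueError("unspecified: trailing backslash")
        esc = body[i]
        i += 1
        if esc in SIMPLE:
            out.append(SIMPLE[esc])
        elif esc == "c":
            if body[i:i + 2] == "\\\\":
                out.append(0x1C)  # \c\\ yields FS
                i += 2
            elif i < n and body[i] in CONTROL:
                out.append(CONTROL[body[i]])
                i += 1
            else:
                raise ValueError("unspecified: bad \\c operand")
        elif esc == "x":
            j = i
            while j < n and body[j] in string.hexdigits:
                j += 1
            if not 1 <= j - i <= 2:
                # Zero digits, or more than two: result is unspecified.
                raise ValueError("unspecified: \\x needs one or two hex digits")
            out.append(int(body[i:j], 16))
            i = j
        elif esc in string.octdigits:
            j = i
            while j < n and j - i < 2 and body[j] in string.octdigits:
                j += 1
            value = int(esc + body[i:j], 8)  # one to three octal digits
            if value > 0xFF:
                raise ValueError("unspecified: octal value does not fit in a byte")
            out.append(value)
            i = j
        else:
            raise ValueError("unspecified escape: \\" + esc)
    return bytes(out)


if __name__ == "__main__":
    print(decode_dollar_single_quote(r"\e[0m"))
    print(decode_dollar_single_quote(r"it\'s"))
```

Raising an error for the unspecified cases is a choice made for this sketch only. A real shell is free to do something else there, which is exactly why portable scripts should avoid those sequences.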
Bug 144: Standard lacks a (possibly XSI) interface to associate a session to a TTY; tcsetsid() Rejected https://austingroupbugs.net/view.php?id=144 Because we have been unable to get a copyright release for this interface in the last 11 years, this bug is being rejected. If someone can obtain copyright release, please submit a new bug with the appropriate release information. Bug 1609: consequences of giving localedef a bad charmap Accepted https://austingroupbugs.net/view.php?id=1609 This item is tagged for TC3-2008. Bug 1608: Suggesting informative texts for bug-id339 Accepted as Marked https://austingroupbugs.net/view.php?id=1608 This item is tagged for Issue 8. After D2.1 page 2061 line 66817 section sysconf(), add: Although the queries _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN provide a way for a class of "heavy-load" application to estimate the optimal number of threads that can be created to maximize throughput, real-world environments have complications that affect the actual efficiency that can be achieved. For example: There may be more than one "heavy-load" application running on the system. The system may be on battery power, and applications should co-ordinate with the system to ensure that a long-running task can pause, resume, and successfully complete even in the event of a power outage. In case a portable "heavy-load" application wants to avoid the use of extensions, its developers may wish to create threads based on the logical partition of the long-running task, or utilize heuristics such as the ratio between CPU time and real time. Bug 1607: Operator associativity for address chain operator is not specified Accepted as Marked https://austingroupbugs.net/view.php?id=1607 This item is tagged for TC3-2008. On page 2680 line 87365 section ed, change: Commands accept zero, one, or two addresses. If more than the required number of addresses are provided to a command that requires zero addresses, it shall be an error. 
Otherwise, if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command. to: Commands accept zero, one, or two addresses. If one or more addresses are provided to a command that accepts zero addresses, it shall be an error. Otherwise, if more than the maximum number of accepted addresses are provided to a command, the addresses shall be evaluated from first to last and then discarded, until the maximum number of accepted addresses for that command remain. On page 2691 line 87812 section ed, change: Any number of addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the greatest valid number of addresses accepted by the print command. to: More than the maximum number of accepted addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the maximum number of addresses accepted by the print command. On page 2691 line 87818 section ed, change: the search origin for the "/foo/" command depends on this. to: the search origin for the "/foo/" address depends on this. Bug 1606: find: is a directory loop considered to be an error? Accepted as Marked https://austingroupbugs.net/view.php?id=1606 This item is tagged for TC3-2008 After: When it detects an infinite loop, find shall write a diagnostic message to standard error and shall either recover its position in the hierarchy or terminate. add a new sentence: In either case, the final exit status shall be non-zero. Bug 1605: bind: AF_UNIX: extend EADDRINUSE description beyond "symbolic link" Accepted as Marked https://austingroupbugs.net/view.php?id=1605 This item is tagged for TC3-2008 change from: If the address family of the socket is AF_UNIX and the pathname in address names a symbolic link, bind() shall fail and set errno to [EADDRINUSE]. 
to:

If the address family of the socket is AF_UNIX and the pathname in address names an existing file, including a symbolic link, bind() shall treat the address as already in use; see ERRORS below.

Bug 1604: stty default output for control characters Accepted as Marked
https://austingroupbugs.net/view.php?id=1604

This item is tagged for TC3-2008

Change italic undef on line 108235 (D2.1) to ""

Change "values" on line 107979 and 107982 to "value".

On ll. 108152, 108155 replace "icanon" with "−icanon".

Bug 1603: minor error in the pathname resolution Accepted as Marked
https://austingroupbugs.net/view.php?id=1603

This item is tagged for TC3-2008

On page 94 line 2839 change:

Each filename in the pathname is located in the directory specified by its predecessor (for example, in the pathname fragment a/b, file b is located in directory a).

to:

Each filename in the pathname is located in the directory specified by its predecessor (for example, in the pathname fragment a/b, file b is located in the directory specified by a).

On page 94 line 2851 change:

unless the last pathname component before the trailing <slash> characters names an existing directory or a directory entry that is to be created for a directory immediately after the pathname is resolved.

to:

unless the last pathname component before the trailing <slash> characters resolves (with symbolic links followed - see below) to an existing directory or a directory entry that is to be created for a directory immediately after the pathname is resolved.

Next Steps
----------

The next calls are on:

    Mon 2022-10-24 (general bugs)
    Thu 2022-10-27 (general bugs)

The calls are for 90 minutes

Apologies in advance:

    Geoff Clare may have no power: 2022-10-24
    Eric Blake: 2022-10-27, 2022-11-03

Calls are anchored on US time (8am Pacific).

Please check the calendar invites for dial in details.

Bugs are at:
https://austingroupbugs.net

An etherpad is usually up for the meeting, with a URL using the date format as below:

https://posix.rhansen.org/p/20xx-mm-dd

(For write access this uses The Open Group single sign on, for those individuals with gitlab.opengroup.org accounts. Please contact Andrew if you need to be set up.)