Minutes of the 2nd July 2015 Teleconference     Austin-716 Page 1 of 1
Submitted by Andrew Josey, The Open Group.      3rd July 2015

Attendees:
Andrew Josey, The Open Group
Don Cragun, IEEE PASC OR
Roger Faulkner, Oracle, The Open Group OR
Joerg Schilling, FOKUS Fraunhofer
Geoff Clare, The Open Group
Jim Grisanzio, Oracle
Martin Rehak, Oracle
Nick Stoughton, USENIX, ISO/IEC JTC 1/SC 22 OR
Mark Ziegast, SHware Systems
Richard Hansen, BBN
David Clissold, IBM

Apologies:
Eric Blake, Red Hat

* General news

Andrew reported that he has worked with Geoff and Cathy to commence setup of a build tree for the merged draft. A gating item will be production of an updated TC draft, which is an action on Andrew.

Andrew took an action to notify Michael Kipness of our plans regarding balloting. At the moment the anticipated cutoff for new bugs is when the current pending interpretations complete, now around August 3rd.

Andrew still has the action to form the IEEE ballot group.

Andrew has an outstanding action to investigate the next steps with ISO balloting. It's usually a matter of the project editor submitting the text to the secretariat and then requesting a ballot.

* Outstanding actions

(Please note that I have flushed this section to shorten the minutes. To locate the last set of outstanding actions, look to the minutes from 26 Feb 2015.)

Bug 0000887: printf and other functions appear many times in search results OPEN
http://austingroupbugs.net/view.php?id=887

Andrew is investigating.

Bug 0000900: add qsort_r OPEN
http://austingroupbugs.net/view.php?id=900

The consensus was that it's a good idea to add the suggested interface. The usual requirements regarding a sponsor for a new interface apply.

Action: Open Group OR, to ask the Base WG if they wish to sponsor the additional qsort interface proposed here.

Bug 0000901: reserve _POSIX* shell option namespace for future use OPEN
http://austingroupbugs.net/view.php?id=901

The forward plan for this bug remains as before:

Richard: file a new bug report with a concrete feature that would use the _POSIX* namespace (as motivation for reserving set -o _POSIX*)

All: debate the proposed feature. If it's something we want, then revisit bug #901. If not, close bug #901.

Bug 0000922: Implementations should be allowed to change/remove implementation-defined environment variables OPEN
http://austingroupbugs.net/view.php?id=922

This item remains open.

Action on Eric: propose wording for Issue 8 to add secure_getenv(), and make it clear that deleting from environment without explicit request is not compliant, but ignoring is fine.

For Issue 7 TC 2: Create new bug to add additional conditions on what makes TMPDIR valid, vs. undefined behavior; also add future directions to getenv() to mention secure_getenv()

* Current Business

Bug 663: Specification of str[n]casecmp is ambiguous Accepted as marked
http://www.austingroupbugs.net/view.php?id=663

Geoff has added a note with a suggested new resolution.

This item is tagged for TC2-2008.

An interpretation is required.

Interpretation response:

The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:

The intention was always that the POSIX locale should have an 8-bit-clean single-byte encoding. The omission of an explicit statement to that effect was an oversight.

We are also seeking proposals for the standardization of a "POSIX.UTF-8" locale for Issue 8.

Notes to the Editor (not part of this interpretation):

On Page: 128 Line: 3596 Section: 6.2 Character Encoding
(2013 edition Page: 128 Line: 3623)

Change from:

The POSIX locale contains the characters in [xref to Table 6-1], which have the properties listed in [xref to 7.3.1].
In other locales, the presence, meaning, and representation of any additional characters are locale-specific.

to:

The POSIX locale shall contain 256 single-byte characters including the characters in [xref to Table 6-1] and [xref to Table 6-2], which have the properties listed in [xref to 7.3.1]. It is unspecified whether characters not listed in those two tables are classified as punct or cntrl, or neither. Other locales shall contain the characters in [xref to Table 6-1] and may contain any or all of the control characters identified in [xref to Table 6-2] that are not included in [xref to Table 6-1]; the presence, meaning, and representation of any additional characters are locale-specific.

[Note to the TC2 editors: the above is a layered change.]

On Page: 136 Line: 3849 Section: 7.2 POSIX locale
(2013 edition Page: 136 Line: 3885)

Delete:

The tables in Section 7.3 describe the characteristics and behavior of the POSIX locale for data consisting entirely of characters from the portable character set and the control character set. For other characters, the behavior is unspecified.

On Page: 139 Line: 3996 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 139 Line: 4032)

Change from:

In the POSIX locale, the 26 uppercase letters shall be included:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

to:

In the POSIX locale, only:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

shall be included.

On Page: 139 Line: 4003 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 139 Line: 4039)

Change from:

In the POSIX locale, the 26 lowercase letters shall be included:

a b c d e f g h i j k l m n o p q r s t u v w x y z

to:

In the POSIX locale, only:

a b c d e f g h i j k l m n o p q r s t u v w x y z

shall be included.

On Page: 139 Line: 4009 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 140 Line: 4045)

Change from:

In the POSIX locale, all characters in the classes upper and lower shall be included.

to:

In the POSIX locale, only characters in the classes upper and lower shall be included.

On Page: 141 Line: 4091 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 141 Line: 4127)

Change from:

In the POSIX locale, at a minimum, the 26 lowercase characters:

to:

In the POSIX locale, the 26 lowercase characters:

On Page: 142 Line: 4105 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 142 Line: 4141)

Change from:

In the POSIX locale, at a minimum, the 26 uppercase characters:

to:

In the POSIX locale, the 26 uppercase characters:

On Page: 143 Line: 4142 Section: 7.3.1.1 LC_CTYPE Category in the POSIX Locale
(2013 edition Page: 143 Line: 4178)

Change from:

The character classifications for the POSIX locale follow; the code listing depicts the localedef input, and the table represents the same information, sorted by character.

to:

The minimum character classifications for the POSIX locale follow; the code listing depicts the localedef input, and the table represents the same information, sorted by character. Implementations may add additional characters to the cntrl and punct classifications but shall not make any other additions.

On Page: 143 Line: 4145 Section: 7.3.1.1 LC_CTYPE Category in the POSIX Locale
(2013 edition Page: 143 Line: 4181)

Change from:

# The following is the POSIX locale LC_CTYPE.
# "alpha" is by default "upper" and "lower"
# "alnum" is by definition "alpha" and "digit"
# "print" is by default "alnum", "punct", and the
# "graph" is by default "alnum" and "punct"

to:

# The following is the minimum POSIX locale LC_CTYPE.
# "alpha" is by definition "upper" and "lower"
# "alnum" is by definition "alpha" and "digit"
# "print" is by definition "alnum", "punct", and the
# "graph" is by definition "alnum" and "punct"

On Page: 151 Line: 4531 Section: 7.3.2.6 LC_COLLATE Category in the POSIX Locale
(2013 edition Page: 151 Line: 4567)

Change from:

The collation sequence definition of the POSIX locale follows; the code listing depicts the localedef input.

to:

The minimum collation sequence definition of the POSIX locale follows; the code listing depicts the localedef input. All characters not explicitly listed here shall be inserted in the character collation order after the listed characters and shall be assigned unique primary weights. If the listed characters have ASCII encoding, the other characters shall be in ascending order according to their coded character set values; otherwise, the order of the other characters is unspecified. The collation sequence shall not include any multi-character collating elements.

On Page: 151 Line: 4534 Section: 7.3.2.6 LC_COLLATE Category in the POSIX Locale
(2013 edition Page: 151 Line: 4570)

Change from:

# This is the POSIX locale definition for the LC_COLLATE category.
# The order is the same as in the ASCII codeset.

to:

# This is the minimum input for the POSIX locale definition for the
# LC_COLLATE category. Characters in this list are in the same order
# as in the ASCII codeset.

On Page: 355 Line: 11953 Section:
(2013 edition Page: 358 Line: 12042)

After:

{MB_CUR_MAX} Maximum number of bytes in a character specified by the current locale (category LC_CTYPE).

add a new sentence:

[CX]In the POSIX locale the value of {MB_CUR_MAX} shall be 1.[/CX]

On Page: 622 Line: 21263 Section: btowc()
(2013 edition Page: 627 Line: 21451)

In the RETURN VALUE section, add a new sentence:

[CX]In the POSIX locale, btowc() shall not return WEOF if c has a value in the range 0 to 255 inclusive.[/CX]

On Page: 1270 Line: 41775 Section: mblen()
(2013 edition Page: 1282 Line: 42472)

In the ERRORS section, change from:

[XSI][EILSEQ] An invalid character sequence is detected.[/XSI]

to:

[CX][EILSEQ] An invalid character sequence is detected.
In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 1272 Line: 41825 Section: mbrlen()
(2013 edition Page: 1284 Line: 42526)

In the ERRORS section, change from:

[EILSEQ] An invalid character sequence is detected.

to:

[EILSEQ] An invalid character sequence is detected. [CX]In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 1275 Line: 41890 Section: mbrtowc()
(2013 edition Page: 1287 Line: 42594)

In the ERRORS section, change from:

[EILSEQ] An invalid character sequence is detected.

to:

[EILSEQ] An invalid character sequence is detected. [CX]In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 1278 Line: 41998 Section: mbsrtowcs()
(2013 edition Page: 1290 Line: 42706)

In the ERRORS section, change from:

[EILSEQ] An invalid character sequence is detected.

to:

[EILSEQ] An invalid character sequence is detected. [CX]In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 1279 Line: 42051 Section: mbstowcs()
(2013 edition Page: 1291 Line: 42760)

In the ERRORS section, change from:

[XSI][EILSEQ] An invalid byte sequence is detected.[/XSI]

to:

[CX][EILSEQ] An invalid character sequence is detected. In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 1281 Line: 42104 Section: mbtowc()
(2013 edition Page: 1293 Line: 42815)

In the ERRORS section, change from:

[XSI][EILSEQ] An invalid character sequence is detected.[/XSI]

to:

[CX][EILSEQ] An invalid character sequence is detected.
In the POSIX locale an EILSEQ error cannot occur since all byte values are valid characters.[/CX]

On Page: 2455 Line: 78223 Section: awk
(2013 edition Page: 2478 Line: 79587)

In the APPLICATION USAGE section, add a new paragraph:

When using awk to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 2537 Line: 81424 Section: comm
(2013 edition Page: 2561 Line: 82825)

In the APPLICATION USAGE section, add a new paragraph:

When using comm to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 2786 Line: 90886 Section: grep
(2013 edition Page: 2810 Line: 92292)

In the APPLICATION USAGE section, add a new paragraph:

When using grep to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 2792 Line: 91089 Section: head
(2013 edition Page: 2816 Line: 92496)

In the APPLICATION USAGE section, change from:

None.

to:

When using head to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 2817 Line: 91965 Section: join
(2013 edition Page: 2841 Line: 93377)

In the APPLICATION USAGE section, add a new paragraph:

When using join to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 3159 Line: 105039 Section: sed
(2013 edition Page: 3185 Line: 106550)

In the APPLICATION USAGE section, add a new paragraph:

When using sed to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 3187 Line: 106180 Section: sort
(2013 edition Page: 3214 Line: 107719)

In the APPLICATION USAGE section, add a new paragraph:

When using sort to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined.
In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 3214 Line: 107142 Section: tail
(2013 edition Page: 3241 Line: 108681)

In the APPLICATION USAGE section, add a new paragraph:

When using tail to process pathnames, and the -c option is not specified, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 3250 Line: 108473 Section: tr
(2013 edition Page: 3277 Line: 110019)

In the RATIONALE section, delete:

This meant that historical practice of being able to specify tr -cd\000-\177 (which would delete all bytes with the top bit set) would have no effect because, in the C locale, bytes with the values octal 200 to octal 377 are not characters.

On Page: 3283 Line: 109551 Section: uniq
(2013 edition Page: 3310 Line: 111099)

In the APPLICATION USAGE section, add a new paragraph:

When using uniq to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales, in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and therefore this problem is avoided.

On Page: 3454 Line: 115933 Section: A.6.2 Character Encoding
(2013 edition Page: 3483 Line: 117516)

Add a new paragraph:

Earlier versions of this standard did not state the requirement that the POSIX locale contains 256 single-byte characters. This was an oversight; the intention was always that the POSIX locale should have an 8-bit-clean single-byte encoding.

We also looked at starting bugs 946, 962 and 958 but did not have enough time for them.

Next Steps
----------
The next call is on July 9, 2015 (a Thursday).

Calls are anchored on US time (8am Pacific).

This call will be for the regular 90 minutes.

http://austingroupbugs.net

An IRC channel will be available for the meeting
irc://irc.freenode.net/austingroupbugs

An etherpad is usually up for the meeting, with a URL using the date format as below:
http://posix@posix.rhansen.org:9001/p/201x-mm-dd
password=2115756#