Minutes of the 2nd July 2015 Teleconference     Austin-716 Page 1 of 1
Submitted by Andrew Josey, The Open Group.  3rd July 2015

Attendees:
    Andrew Josey, The Open Group
    Don Cragun, IEEE PASC OR
    Roger Faulkner, Oracle, The Open Group OR
    Joerg Schilling, FOKUS Fraunhofer
    Geoff Clare, The Open Group
    Jim Grisanzio, Oracle
    Martin Rehak, Oracle
    Nick Stoughton, USENIX, ISO/IEC JTC 1/SC 22 OR
    Mark Ziegast, SHware Systems
    Richard Hansen, BBN 
    David Clissold, IBM

Apologies:
    Eric Blake, Red Hat

* General news

Andrew reported that he has worked with Geoff and Cathy to start
setting up a build tree for the merged draft. A gating item is the
production of an updated TC draft, which is an action on Andrew.

Andrew took an action to notify Michael Kipness of our plans regarding
balloting.

At the moment the anticipated cutoff for new bugs is when the current
pending interpretations complete, now expected around August 3rd.

Andrew still has the action to form the IEEE ballot group.

Andrew has an outstanding action to investigate the next steps with
ISO balloting.  It's usually a matter of the project editor submitting
the text to the secretariat and then requesting a ballot.

* Outstanding actions

(Please note that I have flushed this section to shorten the minutes;
to locate the last set of outstanding actions, look
to the minutes from 26 Feb 2015.)

Bug 0000887: printf and other functions appear many times in search results OPEN
http://austingroupbugs.net/view.php?id=887
Andrew is investigating.

Bug 0000900: add qsort_r        OPEN
http://austingroupbugs.net/view.php?id=900

The consensus was that it's a good idea to add the suggested interface.
The usual requirements regarding a sponsor for a new interface
apply.

Action: Open Group OR to ask the Base WG if they wish to sponsor
the additional qsort interface proposed here.

Bug 0000901: reserve _POSIX* shell option namespace for future use        OPEN
http://austingroupbugs.net/view.php?id=901

The forward plan for this bug remains as before:

    Richard: file a new bug report with a concrete feature that
    would use the _POSIX* namespace (as motivation for reserving
    set -o _POSIX*)

    All: debate the proposed feature.  If it's something we want,
    then revisit bug #901.  If not, close bug #901.


Bug 0000922: Implementations should be allowed to change/remove implementation-defined environment variables         OPEN
http://austingroupbugs.net/view.php?id=922

This item remains open. 

Action on Eric: propose wording for Issue 8 to add secure_getenv(),
and make it clear that deleting from environment without explicit
request is not compliant, but ignoring is fine.

For Issue 7 TC 2: Create new bug to add additional conditions on
what makes TMPDIR valid, vs. undefined behavior; also add future
directions to getenv() to mention secure_getenv()

* Current Business

Bug 663: Specification of str[n]casecmp is ambiguous Accepted as marked
http://www.austingroupbugs.net/view.php?id=663

Geoff has added a note with a suggested new resolution.

This item is tagged for TC2-2008.
An interpretation is required.

Interpretation response:
The standard is unclear on this issue, and no conformance distinction
can be made between alternative implementations based on this. This
is being referred to the sponsor.

Rationale:
The intention was always that the POSIX locale should have an
8-bit-clean single-byte encoding. The omission of an explicit
statement to that effect was an oversight.

We are also seeking proposals for the standardization of a "POSIX.UTF-8" locale for Issue 8.

Notes to the Editor (not part of this interpretation):

On Page: 128 Line: 3596 Section: 6.2 Character Encoding
(2013 edition Page: 128 Line: 3623)

Change from:

The POSIX locale contains the characters in [xref to Table 6-1],
which have the properties listed in [xref to 7.3.1]. In other
locales, the presence, meaning, and representation of any additional
characters are locale-specific.

to:

The POSIX locale shall contain 256 single-byte characters including
the characters in [xref to Table 6-1] and [xref to Table 6-2], which
have the properties listed in [xref to 7.3.1]. It is unspecified
whether characters not listed in those two tables are classified
as punct or cntrl, or neither. Other locales shall contain the
characters in [xref to Table 6-1] and may contain any or all of the
control characters identified in [xref to Table 6-2] that are not
included in [xref to Table 6-1]; the presence, meaning, and
representation of any additional characters are locale-specific.

[Note to the TC2 editors: the above is a layered change.]

On Page: 136 Line: 3849 Section: 7.2 POSIX locale
(2013 edition Page: 136 Line: 3885)

Delete:

The tables in Section 7.3 describe the characteristics and behavior
of the POSIX locale for data consisting entirely of characters from
the portable character set and the control character set. For other
characters, the behavior is unspecified.

On Page: 139 Line: 3996 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 139 Line: 4032)

Change from:

In the POSIX locale, the 26 uppercase letters shall be included:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

to:

In the POSIX locale, only:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

shall be included.

On Page: 139 Line: 4003 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 139 Line: 4039)

Change from:

In the POSIX locale, the 26 lowercase letters shall be included:

a b c d e f g h i j k l m n o p q r s t u v w x y z

to:

In the POSIX locale, only:

a b c d e f g h i j k l m n o p q r s t u v w x y z

shall be included.

On Page: 139 Line: 4009 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 140 Line: 4045)

Change from:

In the POSIX locale, all characters in the classes upper and lower
shall be included.

to:

In the POSIX locale, only characters in the classes upper and lower
shall be included.

On Page: 141 Line: 4091 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 141 Line: 4127)

Change from:

In the POSIX locale, at a minimum, the 26 lowercase characters:

to:

In the POSIX locale, the 26 lowercase characters:

On Page: 142 Line: 4105 Section: 7.3.1 LC_CTYPE
(2013 edition Page: 142 Line: 4141)

Change from:

In the POSIX locale, at a minimum, the 26 uppercase characters:

to:

In the POSIX locale, the 26 uppercase characters:

On Page: 143 Line: 4142 Section: 7.3.1.1 LC_CTYPE Category in the POSIX Locale
(2013 edition Page: 143 Line: 4178)

Change from:

The character classifications for the POSIX locale follow; the code
listing depicts the localedef input, and the table represents the
same information, sorted by character.

to:

The minimum character classifications for the POSIX locale follow;
the code listing depicts the localedef input, and the table represents
the same information, sorted by character. Implementations may add
additional characters to the cntrl and punct classifications but
shall not make any other additions.
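
(Illustration only, not part of the proposed text.) The effect of the
"only" wording for the upper and lower classes can be checked from C
by counting which byte values each classification accepts. The sketch
below is a minimal example under stated assumptions: the helper names
count_class() and posix_letter_counts() are hypothetical, and the
counts of 26, 26 and 52 are what the revised text would require, not
something every existing implementation is known to report.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: count the byte values 0..255 for which the
 * given <ctype.h> classification function returns nonzero in the
 * current locale. */
int count_class(int (*classify)(int))
{
    int n = 0;
    for (int c = 0; c <= 255; c++) {
        if (classify(c))
            n++;
    }
    return n;
}

/* Hypothetical helper: select the POSIX locale and report the sizes
 * of the upper, lower and alpha classes through the out parameters.
 * Returns 0 on success, -1 if the locale could not be selected. */
int posix_letter_counts(int *upper, int *lower, int *alpha)
{
    if (setlocale(LC_ALL, "POSIX") == NULL)
        return -1;
    *upper = count_class(isupper);
    *lower = count_class(islower);
    *alpha = count_class(isalpha);
    return 0;
}
```

With the revised wording, a caller would expect upper and lower to
each contain exactly the 26 listed letters, and alpha to contain
their union.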

On Page: 143 Line: 4145 Section: 7.3.1.1 LC_CTYPE Category in the POSIX Locale
(2013 edition Page: 143 Line: 4181)

Change from:

# The following is the POSIX locale LC_CTYPE.
# "alpha" is by default "upper" and "lower"
# "alnum" is by definition "alpha" and "digit"
# "print" is by default "alnum", "punct", and the <space>
# "graph" is by default "alnum" and "punct"

to:

# The following is the minimum POSIX locale LC_CTYPE.
# "alpha" is by definition "upper" and "lower"
# "alnum" is by definition "alpha" and "digit"
# "print" is by definition "alnum", "punct", and the <space>
# "graph" is by definition "alnum" and "punct"


On Page: 151 Line: 4531 Section: 7.3.2.6 LC_COLLATE Category in the POSIX Locale
(2013 edition Page: 151 Line: 4567)

Change from:

The collation sequence definition of the POSIX locale follows; the
code listing depicts the localedef input.

to:

The minimum collation sequence definition of the POSIX locale
follows; the code listing depicts the localedef input. All characters
not explicitly listed here shall be inserted in the character
collation order after the listed characters and shall be assigned
unique primary weights. If the listed characters have ASCII encoding,
the other characters shall be in ascending order according to their
coded character set values; otherwise, the order of the other
characters is unspecified. The collation sequence shall not include
any multi-character collating elements.
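
(Illustration only, not part of the proposed text.) Where the listed
characters have ASCII encoding, the ordering rule above amounts to
comparing by coded character set value, including for bytes outside
the portable character set. A minimal sketch, assuming an ASCII-based
implementation; the helper name posix_collate_sign() is hypothetical:

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: compare two strings with strcoll() in the
 * POSIX locale and reduce the result to -1, 0 or 1.
 * Returns -2 if the POSIX locale could not be selected. */
int posix_collate_sign(const char *a, const char *b)
{
    if (setlocale(LC_COLLATE, "POSIX") == NULL)
        return -2;
    int r = strcoll(a, b);
    return (r > 0) - (r < 0);
}
```

Under the proposed text, a byte such as 0xE9 would collate after
every listed character, so posix_collate_sign("z", "\xe9") would be
expected to be negative.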

On Page: 151 Line: 4534 Section: 7.3.2.6 LC_COLLATE Category in the POSIX Locale
(2013 edition Page: 151 Line: 4570)

Change from:

# This is the POSIX locale definition for the LC_COLLATE category.
# The order is the same as in the ASCII codeset.

to:

# This is the minimum input for the POSIX locale definition for the
# LC_COLLATE category. Characters in this list are in the same order
# as in the ASCII codeset.


On Page: 355 Line: 11953 Section: <stdlib.h>
(2013 edition Page: 358 Line: 12042)

After:

{MB_CUR_MAX} Maximum number of bytes in a character specified by
the current locale (category LC_CTYPE).

add a new sentence:

[CX]In the POSIX locale the value of {MB_CUR_MAX} shall be 1.[/CX]
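
(Illustration only, not part of the proposed text.) An application
can observe this requirement directly. A minimal sketch; the helper
name posix_mb_cur_max() is hypothetical:

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: return the value of MB_CUR_MAX after
 * selecting the POSIX locale for LC_CTYPE, or 0 if the locale
 * could not be selected. */
size_t posix_mb_cur_max(void)
{
    if (setlocale(LC_CTYPE, "POSIX") == NULL)
        return 0;
    return MB_CUR_MAX;
}
```

With the added sentence, the returned value would be required to
be 1.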

On Page: 622 Line: 21263 Section: btowc()
(2013 edition Page: 627 Line: 21451)

In the RETURN VALUE section, add a new sentence:

[CX]In the POSIX locale, btowc() shall not return WEOF if c has a
value in the range 0 to 255 inclusive.[/CX]
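
(Illustration only, not part of the proposed text.) The requirement
can be probed by trying every byte value. A minimal sketch; the
helper name posix_btowc_failures() is hypothetical. An implementation
conforming to the new sentence would report zero failures, while an
implementation that treats only the ASCII range as valid in the
POSIX locale may report a nonzero count.

```c
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper: count the byte values 0..255 for which
 * btowc() returns WEOF in the POSIX locale.
 * Returns -1 if the POSIX locale could not be selected. */
int posix_btowc_failures(void)
{
    if (setlocale(LC_CTYPE, "POSIX") == NULL)
        return -1;

    int failures = 0;
    for (int c = 0; c <= 255; c++) {
        if (btowc(c) == WEOF)
            failures++;
    }
    return failures;
}
```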

On Page: 1270 Line: 41775 Section: mblen()
(2013 edition Page: 1282 Line: 42472)

In the ERRORS section, change from:

[XSI][EILSEQ]

    An invalid character sequence is detected.[/XSI]

to:

[CX][EILSEQ]

    An invalid character sequence is detected. In the POSIX locale
    an EILSEQ error cannot occur since all byte values are valid
    characters.[/CX]


On Page: 1272 Line: 41825 Section: mbrlen()
(2013 edition Page: 1284 Line: 42526)

In the ERRORS section, change from:

[EILSEQ]

    An invalid character sequence is detected.

to:

[EILSEQ]

    An invalid character sequence is detected. [CX]In the POSIX
    locale an EILSEQ error cannot occur since all byte values are
    valid characters.[/CX]


On Page: 1275 Line: 41890 Section: mbrtowc()
(2013 edition Page: 1287 Line: 42594)

In the ERRORS section, change from:

[EILSEQ]

    An invalid character sequence is detected.

to:

[EILSEQ]

    An invalid character sequence is detected. [CX]In the POSIX
    locale an EILSEQ error cannot occur since all byte values are
    valid characters.[/CX]
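
(Illustration only, not part of the proposed text.) The EILSEQ
statement can be exercised by converting a single byte with
mbrtowc(). A minimal sketch; the helper name posix_byte_converts()
is hypothetical. Under the proposed text the EILSEQ branch would be
unreachable in the POSIX locale, but implementations that predate
the change may still take it for bytes above 127.

```c
#include <errno.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try to convert one byte to a wide character
 * in the POSIX locale.
 * Returns 1 if the byte converted, 0 if mbrtowc() failed with
 * EILSEQ, and -1 for any other outcome (including failure to
 * select the locale). */
int posix_byte_converts(unsigned char byte)
{
    if (setlocale(LC_CTYPE, "POSIX") == NULL)
        return -1;

    char in = (char)byte;
    wchar_t wc;
    mbstate_t state;
    memset(&state, 0, sizeof state);    /* initial conversion state */

    errno = 0;
    size_t n = mbrtowc(&wc, &in, 1, &state);
    if (n == (size_t)-1 && errno == EILSEQ)
        return 0;
    /* mbrtowc() returns 0 for the null byte and 1 for any other
     * single-byte character. */
    if (n == 1 || (n == 0 && byte == 0))
        return 1;
    return -1;
}
```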


On Page: 1278 Line: 41998 Section: mbsrtowcs()
(2013 edition Page: 1290 Line: 42706)

In the ERRORS section, change from:

[EILSEQ]

    An invalid character sequence is detected.

to:

[EILSEQ]

    An invalid character sequence is detected. [CX]In the POSIX
    locale an EILSEQ error cannot occur since all byte values are
    valid characters.[/CX]


On Page: 1279 Line: 42051 Section: mbstowcs()
(2013 edition Page: 1291 Line: 42760)

In the ERRORS section, change from:

[XSI][EILSEQ]

    An invalid byte sequence is detected.[/XSI]

to:

[CX][EILSEQ]

    An invalid character sequence is detected. In the POSIX locale
    an EILSEQ error cannot occur since all byte values are valid
    characters.[/CX]


On Page: 1281 Line: 42104 Section: mbtowc()
(2013 edition Page: 1293 Line: 42815)

In the ERRORS section, change from:

[XSI][EILSEQ]

    An invalid character sequence is detected.[/XSI]

to:

[CX][EILSEQ]

    An invalid character sequence is detected. In the POSIX locale
    an EILSEQ error cannot occur since all byte values are valid
    characters.[/CX]


On Page: 2455 Line: 78223 Section: awk
(2013 edition Page: 2478 Line: 79587)

In the APPLICATION USAGE section, add a new paragraph:

When using awk to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 2537 Line: 81424 Section: comm
(2013 edition Page: 2561 Line: 82825)

In the APPLICATION USAGE section, add a new paragraph:

When using comm to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 2786 Line: 90886 Section: grep
(2013 edition Page: 2810 Line: 92292)

In the APPLICATION USAGE section, add a new paragraph:

When using grep to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 2792 Line: 91089 Section: head
(2013 edition Page: 2816 Line: 92496)

In the APPLICATION USAGE section, change from:

None.

to:

When using head to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 2817 Line: 91965 Section: join
(2013 edition Page: 2841 Line: 93377)

In the APPLICATION USAGE section, add a new paragraph:

When using join to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 3159 Line: 105039 Section: sed
(2013 edition Page: 3185 Line: 106550)

In the APPLICATION USAGE section, add a new paragraph:

When using sed to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 3187 Line: 106180 Section: sort
(2013 edition Page: 3214 Line: 107719)

In the APPLICATION USAGE section, add a new paragraph:

When using sort to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 3214 Line: 107142 Section: tail
(2013 edition Page: 3241 Line: 108681)

In the APPLICATION USAGE section, add a new paragraph:

When using tail to process pathnames, and the -c option is not
specified, it is recommended that LC_ALL, or at least LC_CTYPE and
LC_COLLATE, are set to POSIX or C in the environment, since pathnames
can contain byte sequences that do not form valid characters in
some locales, in which case the utility's behavior would be undefined.
In the POSIX locale each byte is a valid single-byte character, and
therefore this problem is avoided.

On Page: 3250 Line: 108473 Section: tr
(2013 edition Page: 3277 Line: 110019)

In the RATIONALE section, delete:

This meant that historical practice of being able to specify tr
-cd\000-\177 (which would delete all bytes with the top bit set)
would have no effect because, in the C locale, bytes with the values
octal 200 to octal 377 are not characters.

On Page: 3283 Line: 109551 Section: uniq
(2013 edition Page: 3310 Line: 111099)

In the APPLICATION USAGE section, add a new paragraph:

When using uniq to process pathnames, it is recommended that LC_ALL,
or at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
environment, since pathnames can contain byte sequences that do not
form valid characters in some locales, in which case the utility's
behavior would be undefined. In the POSIX locale each byte is a
valid single-byte character, and therefore this problem is avoided.

On Page: 3454 Line: 115933 Section: A.6.2 Character Encoding
(2013 edition Page: 3483 Line: 117516)

Add a new paragraph:

Earlier versions of this standard did not state the requirement
that the POSIX locale contains 256 single-byte characters. This was
an oversight; the intention was always that the POSIX locale should
have an 8-bit-clean single-byte encoding.


We also looked at starting bugs 946, 962 and 958 but did not have
enough time for them.

Next Steps
----------
The next call is on July 9, 2015 (a Thursday)

Calls are anchored on US time (8am Pacific).

This call will be for the regular 90 minutes.

http://austingroupbugs.net

An IRC channel will be available for the meeting
irc://irc.freenode.net/austingroupbugs

An etherpad is usually up for the meeting, with a URL using the date format as below:

http://posix@posix.rhansen.org:9001/p/201x-mm-dd
password=2115756#