Viewing Issue Advanced Details
248 [1003.1(2008)/Issue 7] Shell and Utilities Objection Error 2010-04-30 20:20 2023-11-21 09:49
dwheeler
ajosey  
normal  
Applied  
Accepted As Marked  
   
David A. Wheeler
Shell & Utilities
2789,3003,3260,3385,and more
91004,99054-99063,108778,113329-113334,and more
---
Note: 0006560
Fix the numerous filename/pathname processing errors in specification examples
Many examples in the specification fail, sometimes spectacularly, when given filenames or pathnames permitted by the specification. The specification currently permits filenames to include newline, return, tab, escape, leading "-", double-quotes, single-quotes, ampersand, asterisk, and other "interesting" characters. However, many of the specification's examples will fail to work correctly when such filenames are present.

In a few cases, there's a note that the example presumes that some of these will not occur, but there is no standard way to *ensure* that such filenames cannot occur. In addition, many attackers have learned to *intentionally* create such filenames, since there is no way to prevent them in general. Thus, multi-user systems, or systems that receive external data (e.g., via USB sticks and network drives) are still at risk if they used these examples. This incorrect filename handling can be the basis of security problems, as noted in:
 http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html
It's worth noting that this is not a new problem. Line 82265 noted that the specification has been changed before "to remove the unreliable use of find | xargs".


The examples will need to be examined for patterns such as these:

# WRONG. Fails if newlines/tabs/space/backslash/apostrophe/double-quote/
# ampersand in filenames:
find . -type f | xargs COMMAND

# WRONG. Fails if a file in the current directory begins with "-" or
# if there are no files in the current directory:
COMMAND *
for file in * ; do
 COMMAND "$file"
done
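
A minimal sketch of how the two patterns above might be written more defensively. It is not part of the original report: it assumes a POSIX-style sh plus a mktemp utility, and the scratch directory and filenames are purely illustrative. Names are handed over by find with -exec ... {} + instead of through a pipe, and the glob is prefixed with ./ so that no expanded name can be mistaken for an option:

```shell
# Scratch directory holding awkward but permitted names (illustrative only).
dir=$(mktemp -d) || exit 1
nl='
'
: > "$dir/-n"
: > "$dir/two words"
: > "$dir/line${nl}break"

# Pattern 1: find hands the names straight to the command, so nothing
# has to parse them back out of a text stream. One "x" is printed per name.
found=$(( $(find "$dir" -type f -exec sh -c 'for f do echo x; done' sh {} + | wc -l) ))

# Pattern 2: "./*" keeps every expanded name from starting with "-", and
# the -e test skips the unexpanded pattern when the directory is empty.
looped=$(( $(cd "$dir" && for file in ./*; do
    [ -e "$file" ] || continue
    echo x
done | wc -l) ))

rm -rf "$dir"
```

Both counters count names without ever printing them, so a newline inside a name cannot inflate the result.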



Please either:
1. Fix the various errors. Below are some of them, to get started. It's concerning that even the experts who developed this specification (and they're definitely experts!) made these mistakes. I believe this shows that it is too difficult to correctly use systems while conforming to this specification.
2. Change the rules on filenames to forbid certain filenames. In particular, forbidding control characters in filenames (at least bytes 1-31 and maybe 127) would completely eliminate a number of the problems. In addition, forbidding leading "-" would solve some more (though not as many as the control-characters one).

Page 2789, Line 91004:
Change:
 find . -type f | xargs hash
To:
 find . -type f -exec hash {} \;
Rationale:
 This example tries to explain why hash won't work in this case,
 but the example should only have *that* error, not *other* errors as well.

Page 3003, line 99054-99063:
Remove this entire example.
I believe it is not possible to do what this example is attempting to do
while using standard POSIX utilities, at least by using pax.
This example attempts to determine if the filenames in a pax archive
are valid, but filenames can include newline, and pax does not have any
way to disambiguate these cases (e.g., by having a -0 option) in its output.
POSIX tools simply aren't capable of doing accurate filename checking
in this case, at least not without re-implementing (say, in C) a significant
portion of pax.
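
The ambiguity can be reproduced with any line-oriented listing. The sketch below uses find as a stand-in for pax (an assumption made for brevity; mktemp and the filename are likewise illustrative): a single file whose name contains a newline shows up as two lines of output, and a reader of that output has no way to tell this from two separate files.

```shell
dir=$(mktemp -d) || exit 1
nl='
'
: > "$dir/first${nl}second"   # one file; its name contains a newline

# Line-oriented listing: the single pathname spans two output lines.
lines=$(( $(find "$dir" -type f -print | wc -l) ))

# Counting without parsing names: print one fixed token per file instead.
files=$(( $(find "$dir" -type f -exec sh -c 'for f do echo x; done' sh {} + | wc -l) ))

rm -rf "$dir"
```

Here lines ends up larger than files, which is exactly the discrepancy a filename check based on listing output cannot detect.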

Page 3260, line 108778:
Replace:
  find . -type f | xargs type
with:
  find . -type f -exec type {} \;

Page 3385, lines 113329-113334:
Remove this example, or change it to work properly
(and that may not be easy).
Although it documents that it will not work if
filenames contain newline, there's no way to be guaranteed of that.
It also presumes that the filenames don't have {}, which isn't guaranteed either. If this was executed on a system where a filename DID contain a newline,
or where the source/target directories contained {},
*disaster* could ensue, and examples should not recommend
techniques that might be disastrous.
Most of the other examples on page 3385 should arguably be
removed as well, since there's no mechanism for guaranteeing that
filenames do not include newline (if xargs -0 is added, then they
can probably be easily repaired and kept).

Notes
(0006560)
geoffclare   
2023-10-30 16:34   
Page and line numbers are for Issue 8 draft 3.

Page 2973, Line 99511 (hash APPLICATION USAGE):
Change:
find . -type f | xargs hash
To:
find . -type f -exec hash {} +


Page 3441, line 117538 (type APPLICATION USAGE):
Replace:
find . -type f | xargs type
with:
find . -type f -exec type {} +


On page 3568 line 122228 (xargs EXAMPLES) change example 2 from:
The following command invokes diff with successive pairs of arguments originally typed as command line arguments. It assumes there are no embedded <newline> characters in the elements of the original argument list.
printf "%s\n" "$@" | sed 's/[^[:alnum:]]/\\&/g' |
xargs -E "" -n 2 -x diff
to:
The following command invokes diff with successive pairs of arguments originally typed as command line arguments.
printf "%s\0" "$@" | xargs -0 -n 2 -x diff --
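
As a rough illustration of why the NUL-delimited form needs no escaping step, the sketch below assumes an xargs that supports -0 (as GNU and BSD xargs do). pair_sizes is a hypothetical helper, not part of the proposed text; it reports how many arguments each invocation receives instead of running diff:

```shell
# Hypothetical helper: feed the arguments to xargs two at a time and
# print the argument count seen by each invocation.
pair_sizes() {
    printf '%s\0' "$@" | xargs -0 -n 2 sh -c 'echo "$#"' sh
}

# Arguments containing a blank, a newline, a glob character and an
# apostrophe all arrive intact, so four arguments form two pairs.
pair_sizes 'a b' 'c
d' 'e*' "f'g"
```

Each argument is terminated by a null byte, which cannot occur inside a pathname, so the sed quoting used in the old example is unnecessary.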


On page 3568 line 122233 change example 3 from:
In the following commands, the user is asked which files in the current directory (excluding dotfiles) are to be archived. The files are archived into arch; a, one at a time or b, many at a time. The commands assume that no filenames contain <blank>, <newline>, <backslash>, <apostrophe>, or double-quote characters.
a. ls | xargs -E "" -p -L 1 ar -r arch
b. ls | xargs -E "" -p -L 1 | xargs -E "" ar -r arch
to:
In the following command, the user is asked which regular files below the current directory are to be archived.
find . -type f -print0 | xargs -0 -p -L 1 ar -r arch


On page 3568, lines 122242-122247 remove example 5




Viewing Issue Advanced Details
251 [1003.1(2008)/Issue 7] Base Definitions and Headers Objection Enhancement Request 2010-05-03 18:49 2023-11-21 10:28
dwheeler
ajosey  
normal  
Applied  
Accepted As Marked  
   
David A. Wheeler
XBD 3.170 Filename
60
1781
---
See Note: 0006561.
Forbid newline, or even bytes 1 through 31 (inclusive), in filenames
Forbid bytes 1 through 31 (inclusive) in filenames.

POSIX.1-2008 page 60 lines 1781-1786 states that filenames (aka "pathname components") may contain all characters except <slash> and the null byte, and this has historically been true. However, this excessive permissiveness has resulted in numerous security vulnerabilities and erroneous programs. It also increases the effort to write correct programs, because correctly processing filenames that include characters like newline is very difficult (even the expert POSIX developers have trouble; see 0000248). The "Unix-Haters Handbook" specifically notes the problems caused by control characters (such as newlines) in filenames (see pages 156-157), so this is not a new problem!

A key offender, of course, is the <newline> character. This is widely used as
a filename separator, even though it is strictly speaking not a valid filename separator (since a filename may include it). But other control characters also cause problems. Another common problematic character is the <tab> character; files with records terminating in newline, and fields separated by tab, are extremely common, and are encouraged by some tools (e.g., by the default delimiter of "cut" and "paste"). Some terminals and terminal emulators accept control characters (range 1-31), e.g., via the escape character. Simply *displaying* filenames with bytecodes in this range can cause problems on such systems. (Granted, there is no requirement that terminals must only accept control characters in the range 1-31, but if 1-31 could not be in filenames, that would be a reasonable configuration to move to.)

In practice, the primary use of filenames with control characters appears to be to enable security vulnerabilities and to cause errors. Other than "we've always done it that way", there seems to be little justification for them. By forbidding control characters, many programs that are currently erroneous (e.g., because they process filenames one line at a time) become correct. The number of applications this change would *fix* is far larger than the vanishingly small number of programs that non-portably *depend* on control characters being permitted in filenames.
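
Until such names are forbidden, a script that processes filenames line by line has to check for them itself. A minimal sketch, assuming a POSIX sh; has_control_chars is a hypothetical helper name. Note that [[:cntrl:]] in the POSIX locale also matches DEL (127), so it is slightly broader than the 1-31 range proposed here:

```shell
# Hypothetical helper: succeed if the argument contains a control character.
has_control_chars() {
    case $1 in
        *[[:cntrl:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use: skip names that cannot be written safely one per line.
for name in 'plain-name.txt' 'tab	inside'; do
    if has_control_chars "$name"; then
        printf 'skipping a name that contains a control character\n' >&2
    else
        printf '%s\n' "$name"
    fi
done
```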

Any program that depends on control characters in filenames is *already* not portable,
since control characters are not in the "portable character set" of filenames
(Page 77, line 2194, XBD 3.276 Portable Filename Character Set).
What's more, NTFS (and probably other filesystems) already forbid bytes 1-31 in filenames,
so this is already an extant limitation. But merely making it "not portable" is not enough;
it needs to be *forbidden* before correct programs can count on it.

This proposal suggests a new error, [ENAME], though other options are possible.

I'm well aware that this may be a controversial proposal. But I think most readers will understand *why* I am proposing this, and that while this is a dramatic approach, we are now in an era where POSIX systems are routinely used to manage important information worldwide. The ability to include control characters in filenames has rarely been a help, and instead has been a hindrance to the use of these systems. We've had ample time to see that this is a problem. It's time to jettison this misfeature.


Vol. 1, Page 60, line 1782-1784:
Change:
 "The characters composing the name may be selected
  from the set of all character values excluding the <slash> character
  and the null byte."
to:
 "The characters composing the name may be selected
  from the set of all character values excluding the <slash> character,
  the null byte, and character values 1 through 31 inclusive.
  Attempts to create filenames containing bytes 1 through 31 must be rejected,
  and conforming implementations
  must not return such filenames to applications or users."


Vol. 1, Page 77, append after 2199:
Character values 1 through 31, inclusive, are expressly forbidden.
 Attempts to create such filenames must be rejected, and conforming implementations
 must not return such filenames to applications or users.

Vol. 2, page 480,
[ENAME]
 The path name is not a permitted name, e.g., it contains bytes 1 through 31 (inclusive).


Vol. 2, page 1382, in "open()", after line 45297, state:
"If open() or openat() is passed a path containing byte 1 through 31 (inclusive),
it must reply with error ENAME instead of opening the file."

Vol. 2, page 1382, in open(), before line 45319, state:
[ENAME] The path name is not a permitted name, e.g., it contains bytes 1 through 31 (inclusive).

Vol. 2, page 1744, lines 55680-55682: Append:
"The readdir( ) function shall not return directory entries of names containing bytes 1 through 31, inclusive."

There would need to be other changes in the interfaces, but there's no point in identifying these other changes if this proposal will be flatly rejected. Since this involves filenames, which occur all over in the spec, it may be possible to have this change described in a few places instead of many. I've identified open() and readdir() here, because many other interfaces are built from these two.

Notes
(0006561)
Don Cragun   
2023-10-30 17:24   
(edited on: 2023-10-31 16:50)
Proposed changes (All page and line numbers refer to Issue 8 Draft 3) to be applied after the changes for 0000253 have been applied:
Change P266 L23296 (XSH bind rationale) from:
None.
to:
Implementations are encouraged to have bind() report an [EILSEQ] error if the last component of the address to be bound to an AF_UNIX family socket contains any bytes that have the encoded value of a <newline> character.

Move the [EEXIST] error on P1027, L35244-35245 (XSH fopen errors) before the [EILSEQ] error on P1027, L35241-35243 to put the errors in alphabetical order.
Add a new paragraph after P977, L33357 (XSH fopen rationale):
Implementations are encouraged to have fopen() and freopen() report an [EILSEQ] error if mode begins with 'w' or 'a', the file did not previously exist, and the last component of pathname contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1330, L44792 (XSH link rationale):
Implementations are encouraged to have link() and linkat() report an [EILSEQ] error if the file named by path2 did not previously exist, and the last component of that pathname contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1407, L47277 (XSH mkdir rationale):
Implementations are encouraged to have mkdir() and mkdirat() report an [EILSEQ] error if the last component of path contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1411, L47406 (XSH mkdtemp rationale):
Implementations are encouraged to have mkdtemp(), mkostemp() and mkstemp() report an [EILSEQ] error if the last component of the pathname in template contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1414, L47532 (XSH mkfifo rationale):
Implementations are encouraged to have mkfifo() and mkfifoat() report an [EILSEQ] error if the last component of path contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1419, L47699 (XSH mknod rationale):
Implementations are encouraged to have mknod() and mknodat() report an [EILSEQ] error if the last component of path contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1514, L50793 (XSH open rationale):
Implementations are encouraged to have open() and openat() report an [EILSEQ] error if oflag contains O_CREAT, the file did not previously exist, and the last component of path contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P1891, L62567 (XSH rename rationale):
Implementations are encouraged to have rename() and renameat() report an [EILSEQ] error if the file named by new does not already exist and the last component of that pathname contains any bytes that have the encoded value of a <newline> character.

Add a new paragraph after P2183, L71316 (XSH symlink rationale):
Implementations are encouraged to have symlink() and symlinkat() report an [EILSEQ] error if the last component of path2 contains any bytes that have the encoded value of a <newline> character.

After P2454 L79567 XCU section 1.4 (Utility Description Defaults: CONSEQUENCES OF ERRORS), add to the first bullet item:
<small>Note: If the requested action is to write one or more pathnames in a format that has <newline> as a terminator or separator, and a pathname to be written contains any bytes that have the encoded value of a <newline> character, this should be treated as an action that cannot be performed. A future version of this standard may require that utilities treat this as an error.</small>


Start of editor's notes for changes below to XCU:
Replace each occurrence of the string "PARAGRAPH DELIM" with the paragraph:
If this utility is directed to display a pathname that contains any bytes that have the encoded value of a <newline> character when <newline> is a terminator or separator in the output format being used, implementations are encouraged to treat this as an error. A future version of this standard may require implementations to treat this as an error.

Replace each occurrence of the string "PARAGRAPH DIRENT" with the paragraph:
If this utility is directed to create a new directory entry that contains any bytes that have the encoded value of a <newline> character, implementations are encouraged to treat this as an error. A future version of this standard may require implementations to treat this as an error.

End of editor's notes.
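
The behavior encouraged by PARAGRAPH DELIM can be approximated in a script that writes newline-terminated pathnames. This is only a sketch under that reading of the text; write_path_line is a hypothetical helper name and the diagnostic wording is an assumption:

```shell
# Hypothetical helper: write one pathname per line, treating a pathname
# that contains a <newline> as an error instead of emitting ambiguous output.
write_path_line() {
    nl='
'
    case $1 in
        *"$nl"*)
            printf 'write_path_line: pathname contains <newline>\n' >&2
            return 1
            ;;
    esac
    printf '%s\n' "$1"
}

write_path_line 'dir/file.txt'
```

A caller can then rely on every output line being exactly one pathname, and can detect the rejected case from the non-zero return status.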

Change P2562, L83722 section admin FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Change P2573, L84133 section ar FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Add after P2624, L86260 section awk FUTURE DIRECTIONS:
PARAGRAPH DIRENT

Change P2629, L86431 section basename FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Add after P2664, L87886 section c17 FUTURE DIRECTIONS:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Change P2678, L88346 section cd FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2701, L89252 section cksum FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2705, L89390 section cmp FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2715, L89771 section command FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Add after P2721, L89999 section compress FUTURE DIRECTIONS:
PARAGRAPH DIRENT

Change P2730, L90318 section cp FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P2737, L90620 section csplit FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P2742, L90803 section ctags FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Change P2750, L91075 section cxref FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Add after P2767, L91587 section dd FUTURE DIRECTIONS:
PARAGRAPH DIRENT

Change P2771, L91742 section delta FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Change P2775, L91898 section df FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2784, L92243 section diff FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2787, L92359 section dirname FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2791, L92498 section du FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2796, L92681 section ed OPERANDS from:
file If the file argument is given, ed shall simulate an e command on the file named by the pathname, file, before accepting commands from the standard input.
to:
file If the file argument is given, ed shall perform the effect of an e command on the pathname file before accepting commands from the standard input, except that file can contain a <newline>, even though this is not possible for the argument to the e command.

Change P2811, L93294 section ed FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P2888, L96380 section ex FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P2916, L97389 section file FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2926, L97857 section find FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2935, L98138 section fuser FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2947, L98555 section get FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Change P2970, L99423 section grep FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2974, L99526 section hash FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2977, L99637 section head FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P2994, L100247 section ipcs FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3025, L101427 section link FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3029, L101596 section ln FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3040, L102002 section localedef FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Add a new paragraph after P3053, L102472 section ls OPTIONS -C description:
<small>Note: Since the output from this option may use separator characters that include characters that might appear in filenames (in addition to the problems related to <newline>s in filenames), -C should not be used when filenames might be extracted from the output by a script.</small>

Change P3055, L102530-102531 section ls OPTIONS -q description from:
Force each instance of non-printable filename characters and <tab> characters to be written as the <question-mark> ('?') character.
to:
Force each instance of non-printable filename characters (including <newline>, <tab>, and other control characters) to be written as the <question-mark> ('?') character.

Add after P3062, L102840 section ls FUTURE DIRECTIONS:
PARAGRAPH DELIM

Change P3072, L103313 section m4 FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3101, L104386 section mailx FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Add after P3133, L105820 section make FUTURE DIRECTIONS:
PARAGRAPH DIRENT

Change P3140, L106025 section man FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3146, L106228 section mkdir FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3148, L106320 section mkfifo FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3169, L107117 section msgfmt FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3174, L107320 section mv FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3192, L107932 section nm FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3216, L108834 section patch FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3259, L110622 section pax FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM
PARAGRAPH DIRENT

Add a new paragraph after P3266, L110876, section pr APPLICATION USAGE:
If a file operand contains <newline>, <form-feed>, or <vertical-tab> characters, or is overly long, and the pr utility is instructed to include the name of that file in the header, pagination may not be handled correctly. Applications can guard against this by using the -h option (for example, passing a sanitized, truncated form of the pathname with -h).

Change P3280, L111438 section prs FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3289, L111832 section pwd FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3299, L112192 section realpath FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3308, L112527 section rm FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3310, L112621 section rmdel FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3315, L112792 section sact FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3350, L114226 section sh FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3361, L114651 section sort FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3365, L114788 section split FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3396, L115906 section tee FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3421, L116838 section touch FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3441, L117545 section type FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3461, L118235 section unget FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3466, L118382 section uniq FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3471, L118593 section uucp FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3475, L118703 section uudecode FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3490, L119265 section val FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3544, L121335 section vi FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT

Change P3553, L121654 section wc FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Change P3556, L121752 section what FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DELIM

Add after P3575, L122506 section xgettext FUTURE DIRECTIONS:
PARAGRAPH DIRENT

Change P3594, L123258 section yacc FUTURE DIRECTIONS from:
None.
to:
PARAGRAPH DIRENT






Viewing Issue Advanced Details
657 [1003.1(2008)/Issue 7] System Interfaces Objection Clarification Requested 2013-02-08 22:46 2023-11-21 10:35
philip-guenther
ajosey  
normal  
Applied  
Accepted As Marked  
   
Philip Guenther
OpenBSD
fmemopen
867
28775
Approved
Note: 0006535
Conditions under which fmemopen() writes a NUL to the buffer are insufficiently specified
As updated by XSH/TC1/D3/0149, the fmemopen() description now states:

 When a stream open for writing is flushed or closed, a null byte shall
 be written at the current position or at the end of the buffer, depending
 on the size of the contents. If a stream open for update is flushed or
 closed and the last write has advanced the current buffer size, a null
 byte shall be written at the end of the buffer if it fits.

The first sentence does not specify _how_ the choice of where to write the NUL depends on the size of the contents. Therefore, an implementation is presumably conforming if it writes the NUL to the current position whenever the size of the contents is an even number, and to the end of the buffer when it is odd, for example.

The second sentence only indicates that a NUL is (required to be) written if the last write advanced the current buffer size. So, a program that does a write which advanced the buffer size, then seeks back and does another write, and _then_ flushes or closes it cannot depend on a NUL to be written.
Provide an actual specification for when and where the NUL is written in the "open for writing" case, presumably something about the minimum of the two offsets.

Change the second sentence to match what seems to be the glibc behavior:
 If a stream open for update is flushed or closed and the current buffer
 size has been advanced by a write since the stream was opened or flushed,
 a null byte shall be written at the end of the buffer if it fits.

Notes
(0006535)
geoffclare   
2023-10-16 16:06   
(edited on: 2023-10-16 16:09)
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
None.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

Page and line numbers are for Issue 8 draft 3.

Change "size" (when talking about the argument) to "max_size" throughout.
Change "size" (when talking about the last position in the buffer) to "end position" throughout.

Change on P961, L32741:
Open the stream for update (reading and writing). Truncate the buffer contents.
to:
Open the stream for update (reading and writing).

Change P961, L32744-32745 from:
If the mode argument includes 'b', then the stream shall be in binary mode; otherwise the stream shall be in text mode.
to:
If the mode argument begins with 'w' and max_size is not zero, the buffer contents shall be truncated by writing a null byte at the beginning. If the mode argument includes 'b', the results are implementation-defined.

Change on P961, L32758-32762:
The stream shall also maintain the size of the current buffer contents; use of fseek() or fseeko() on the stream with SEEK_END shall seek relative to this size. If mode starts with 'r' or includes 'b', the size shall be set to the value given by the size argument and shall not change. Otherwise, the stream is in text mode and writable, and the size shall be variable; for modes w and w+ the initial size shall be zero and for modes a and a+ the initial size shall be:
to:
The stream shall also maintain the end position of the current buffer contents; use of fseek() or fseeko() on the stream with SEEK_END shall seek relative to this end position. If mode starts with 'r' the end position shall be set to the value given by the max_size argument and shall not change. Otherwise, the stream is writable and the end position shall be variable; for modes w and w+ the initial end position shall be zero and for modes a and a+ the initial end position shall be:

Change on P962, L32775-32776:
When a stream open for writing in text mode is flushed or closed, a null byte shall be written at the current position or at the end of the buffer, depending on the size of the contents. If a stream open for update in text mode is flushed or closed and the last write has advanced the current buffer size, a null byte shall be written at the end of the buffer if it fits. If a stream is opened in binary mode, no additional null byte shall be written.
to:
When a stream open for update (the mode argument includes '+') or for writing only is successfully written and the write advances the current buffer end position, a null byte shall be written at the new buffer end position if it fits.

Change on P963, L32822-32828:
Unlike fopen(), where a 'b' in the mode argument is required to have no effect, fmemopen() distinguishes between text and binary modes. Text mode guarantees that the underlying memory will always be null terminated after any write operation, and tracks the growth of the largest position written to up to that point; while binary mode only modifies the underlying buffer according to direct writes, while seeking relative to the full buffer size. The combination of append and binary modes is not commonly used, since any attempt to write to such a stream will necessarily fail because the stream does not dynamically grow beyond the initial size.
to:
Implementations differ as regards how a 'b' in the mode argument affects the behavior. For some the 'b' has no effect, as is required for fopen(); others distinguish between text and binary modes.

Note that buf will not be null terminated if max_size bytes are written to the memory stream. Applications wanting to guarantee that the buffer will be null terminated need to call fmemopen() with max_size set to one byte smaller than the actual size of buf and set buf[max_size] to a null byte.






Viewing Issue Advanced Details
689 [1003.1(2008)/Issue 7] System Interfaces Editorial Clarification Requested 2013-05-05 15:37 2023-10-10 09:13
dalias
ajosey  
normal  
Applied  
Accepted As Marked  
   
Rich Felker
musl libc
XSH 2.5
unknown
unknown
Approved
Note: 0006428
Possibly unintended allowance for stdio deadlock
XSH 2.5 paragraph 2 reads:

"When a stream is "unbuffered", bytes are intended to appear from the source or at the destination as soon as possible; otherwise, bytes may be accumulated and transmitted as a block. When a stream is "fully buffered", bytes are intended to be transmitted as a block when a buffer is filled. When a stream is "line buffered", bytes are intended to be transmitted as a block when a <newline> byte is encountered. Furthermore, bytes are intended to be transmitted as a block when a buffer is filled, when input is requested on an unbuffered stream, or when input is requested on a line-buffered stream that requires the transmission of bytes. Support for these characteristics is implementation-defined, and may be affected via setbuf() and setvbuf()."

This intent, albeit implementation-defined and thus not in itself normative, reflects the traditional practice of having stdio input functions flush line-buffered streams, to accommodate lazy programming practices such as:

printf("Prompt: ");
scanf("%d", &x);

with no intervening fflush.

Unfortunately, encouraging or even permitting reads to flush all line-buffered output streams has some heavy locking consequences for multithreaded programs, including the possibility of deadlock. If thread A is holding a lock on a line-buffered output stream while waiting for a result from thread B, and thread B happens to use stdio for reading any file as part of its operation (unrelated to thread A's use of the line-buffered stream), the program will deadlock. This behavior seems highly undesirable and unintended.
My understanding is that a definition of "multithreaded program" is being added to the standard, with the intent that certain legacy implementation practices (like the alarm-based sleep implementation) that are incompatible or problematic for multithreaded programs can be ruled out for multithreaded programs while still allowing them in singlethreaded programs. If so, I think it would make sense to make use of that here, by adding text along the lines of:

"In a multithreaded program, performing an operation on a stream shall not cause any other open stream to be locked as if by flockfile. In particular, performing an input operation which results in bytes being transferred shall not cause line-buffered streams to be flushed in a multithreaded program."
Notes
(0006428)
geoffclare   
2023-08-10 16:29   
Interpretation response
------------------------

The standard does not speak to this issue, and as such no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
There are possible deadlocks in multi-threaded applications that were not considered when POSIX added support for threading.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
After Issue 8 draft 3 page 520 line 18489 section 2.5:
All functions that read, write, position, or query the position of a stream, except those with names ending _unlocked, shall lock the stream as if by a call to flockfile( ) before accessing it and release the lock as if by a call to funlockfile( ) when the access is complete.
add:
[CX]If the lock is not immediately available, the function shall wait for it to become available, except in the following circumstances. If the stream is line buffered and is open for writing or for update, and the reason the function is attempting to lock the stream is because it is going to request input on another stream that is unbuffered, or is line buffered and requires the transmission of characters from the host environment (see above), then the function shall attempt to determine whether a deadlock situation exists. If a deadlock situation is found to exist, the function shall fail. If the function is able to establish that a deadlock situation does not exist, it shall wait for the lock to become available. If the function does not establish whether or not a deadlock situation exists, it shall continue as if it had already locked the stream, found its buffer to be empty, and released the lock.[/CX]




Viewing Issue Advanced Details
700 [1003.1(2008)/Issue 7] System Interfaces Comment Clarification Requested 2013-05-19 18:12 2023-05-16 10:38
fbauzac
ajosey  
normal  
Applied  
Accepted As Marked  
   
Fabrice Bauzac
strtoul
na
na
---
Note: 0006256
Clarify strtoul's behaviour on strings representing negative numbers
In which cases does strtoul (and other strtou* functions) fail with ERANGE? How is the range check performed?

This ticket follows the discussion about strtoul on austin-group-l in May 2013.

The RETURN VALUE section in
http://pubs.opengroup.org/onlinepubs/009696799/functions/strtoul.html
says:
If the correct value is outside the range of representable values, {ULONG_MAX} or {ULLONG_MAX} shall be returned and errno set to [ERANGE].

Some people understand this as "if the value I read (the correct value) is outside the [0, ULONG_MAX] range, then strtoul fails with ERANGE".

However, it seems that many implementations do the following:
1. Read independently the optional sign and the subject [0-9]+ sequence
2. If the subject sequence fits in the unsigned long type, then store it in an unsigned long variable. Otherwise (outside the [0, ULONG_MAX] range), fail with ERANGE.
3. If there was a minus sign, then apply negation on the unsigned long variable as if it were a signed long.

According to the discussion on the austin-group-l mailing list, it looks like this is what the POSIX standard (and the C standard) intend to specify. But it looks like it is not clear enough in the specification.
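
The three steps above can be observed with a small sketch. The helper name parse_ul is hypothetical. It assumes the behaviour described in this report: a leading minus sign is applied after the range check, in the unsigned return type, so "-1" converts without ERANGE.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert `s` in base 10 and report whether
 * strtoul() set errno to ERANGE. */
unsigned long parse_ul(const char *s, int *range_error)
{
    errno = 0;
    unsigned long v = strtoul(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}
```

With this helper, parse_ul("-1", &e) yields ULONG_MAX with e == 0 (the negation of 1 in unsigned long), whereas a digit sequence too large for unsigned long yields ULONG_MAX with e == 1.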
In section "RETURN VALUE", replace
- If the correct value is outside the range of representable values
with either of:
+ If the nonnegated value is outside the range of representable values
or
+ If the unnegated value is outside the range of representable values
or
+ If the value before potential negation is outside the range of representable values
or
+ If the absolute value is outside the range of representable values
or something similar
Notes
(0006256)
geoffclare   
2023-04-06 16:30   
On 2018 edition page 2081 line 66795 section strtol(), and
page 2277 line 72497 section wcstol(), change:
the value resulting from the conversion shall be negated.
to:
the resulting value shall be the negative of the converted value.

On 2018 edition page 2086 line 66909 section strtoul(), and
page 2284 line 72661 section wcstoul(), change:
the value resulting from the conversion shall be negated.
to:
the resulting value shall be the negative of the converted value; this action shall be performed in the return type.




Viewing Issue Advanced Details
708 [1003.1(2013)/Issue7+TC1] System Interfaces Editorial Enhancement Request 2013-06-07 21:23 2023-09-04 10:15
dalias
 
normal  
Applied  
Accepted As Marked  
   
Rich Felker
musl libc
XSH 2.9.1 Thread-Safety
unknown
unknown
---
Note: 0006433
Make mblen, mbtowc, and wctomb thread-safe for alignment with C11
Per C11 7.1.4 paragraph 5,

"Unless explicitly stated otherwise in the detailed descriptions that follow, library functions shall prevent data races as follows: A library function shall not directly or indirectly access objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments. A library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments. Implementations may share their own internal objects between threads if the objects are not visible to users and are protected against data races."

7.22.7 (Multibyte/wide character conversion functions) does not specify that these functions are not required to avoid data races with other calls. The only time they would even potentially be subject to data races is for state-dependent encodings, which are all but obsolete; for single-byte or modern multi-byte (i.e. UTF-8) encodings, these functions are pure.

Note that 7.29.6.3 (Restartable multibyte/wide character conversion functions) does make exceptions that the "r" versions of these functions are not required to avoid data races when the state argument is NULL.
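
For illustration, an application that wants to avoid the question entirely can use the restartable functions with a caller-owned state object. The following is a sketch only; the helper name count_mb_chars is hypothetical:

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in a multibyte string using
 * mbrtowc() with a caller-owned state, so no hidden shared state is used.
 * Returns (size_t)-1 on an invalid or incomplete sequence. */
size_t count_mb_chars(const char *s)
{
    mbstate_t st;
    memset(&st, 0, sizeof st);      /* initial conversion state */
    size_t remaining = strlen(s);
    size_t count = 0;

    while (remaining > 0) {
        size_t n = mbrtowc(NULL, s, remaining, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;
        if (n == 0)                 /* null character; not expected here */
            break;
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

The result depends on the LC_CTYPE category of the current locale, as with mblen() and mbtowc().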
Remove mblen, mbtowc, and wctomb from the list of functions which are not required to be thread-safe.
Notes
(0006433)
geoffclare   
2023-08-14 16:25   
In all of these changes the placeholder XXXX should be replaced with the name of the function. Page and line numbers are for Issue 8 draft 2.1.

On page 1274 line 42635 section mblen(), and
page 1285 line 42996 section mbtowc(), and
page 2255 line 72488 section wctomb(), change:
The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard.
to:
Except for requirements relating to data races, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard for all XXXX() functionality except in relation to data races.

On page 1274 line 42652 section mblen(), and
page 1285 line 43015 section mbtowc(), and
page 2255 line 72505 section wctomb(), change:
[CX]The XXXX() function need not be thread-safe.[/CX]
to:
The XXXX() function [CX]need not be thread-safe; however, it[/CX] shall avoid data races with all other functions.

On page 1274 line 42669 section mblen(), and
page 1286 line 43032 section mbtowc(), and
page 2255 line 72522 section wctomb(), change RATIONALE from "None" to:
When the ISO C standard introduced threads in C11, it required XXXX() to avoid data races (with itself as well as with other functions), whereas POSIX.1-2008 did not require it to be thread-safe, and in many implementations it did not avoid data races with itself and still does not. The ISO C committee intend to change the requirements in a future version of the ISO C standard, but since POSIX.1 currently refers to C17 it is necessary for it not to defer to the ISO C standard regarding data races in order to continue to allow this function not to avoid data races with itself.

On page 1274 line 42669 section mblen(), and
page 1286 line 43034 section mbtowc(), and
page 2256 line 72524 section wctomb(), change FUTURE DIRECTIONS from "None" to:
It is expected that a change in a future version of the ISO C standard will allow a future version of this standard to remove the data race exception from the statement that it defers to the ISO C standard.




Viewing Issue Advanced Details
728 [1003.1(2013)/Issue7+TC1] System Interfaces Editorial Clarification Requested 2013-08-05 15:16 2023-09-04 10:21
dalias
 
normal  
Applied  
Accepted As Marked  
   
Rich Felker
musl libc
XSH 2.4.3 Signal Actions
unknown
unknown
---
Note: 0006430
Restrictions on signal handlers are both excessive and insufficient
Per XSH 2.4.3:

"the behavior is undefined if the signal handler refers to any object other than errno with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t, or if the signal handler calls any function defined in this standard other than one of the functions listed in the following table."

The intent here is that signal handlers cannot access objects which might be in a partially-modified state when the (asynchronous) signal handler is invoked. However, this is not what it says. Consider for example a program which allocates an object via malloc (or with automatic storage in main()) and stores the address in /tmp/foo. Per the language of the standard, the signal handler can legitimately open /tmp/foo (open is AS-safe), read that address (read is AS-safe), and dereference the pointer, even though the object may be in a partially-modified state.

As a second example, one can arrange for the address of such an object to be delivered to the signal handler via the sigval argument to realtime signals, timers, etc. and in fact using these features generally REQUIRES a pointer to be delivered to the signal handler.

Moreover, plenty of legitimate accesses to objects with static storage duration are wrongly forbidden:

- Access to const-qualified objects.
- Access to string literals (note in the above example, a string literal could not be passed to open; instead, the array "/tmp/foo" must be automatic).
- Access to a modifiable object of static storage duration whose last modification was sequenced before the signal handler could have been invoked via sigprocmask(), pthread_sigmask(), sigaction(), etc.

There is no reason to forbid such accesses, and to my knowledge there is no historical implementation for which they would not work.
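
For reference, a sketch of a handler that stays within the existing wording of XSH 2.4.3: the only object with static storage duration it touches is assigned through a volatile sig_atomic_t. The names got_signal, on_signal and raise_and_check are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only static object the handler touches. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;   /* plain assignment to volatile sig_atomic_t */
}

/* Hypothetical helper: install the handler, raise the signal in the
 * calling thread, and return the flag value observed afterwards. */
int raise_and_check(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;
    got_signal = 0;
    if (raise(signo) != 0)
        return -1;
    return got_signal;
}
```

Anything beyond this pattern, such as reading a const table or a string literal from the handler, is what the current text forbids and what this report argues should be allowed.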

This issue report is partly inspired by: http://www.tedunangst.com/flak/post/signal-safe-strcpy
Ideally, replace the restriction on access to static objects in XSH 2.4.3 with a proper memory model for access to objects (static or otherwise) from signal handlers based on sequencing by signal masking or other means.

Alternatively, require that no object except objects with automatic storage duration whose lifetimes began within the signal handling context be accessed from within the signal handling context. In my opinion, this would be excessively restrictive, but at least it would close the loophole/inconsistency in the existing language where access to static objects is forbidden but access to dynamic or automatic objects that exist outside the signal handler is permitted.
Notes
(0006430)
geoffclare   
2023-08-14 15:57   
On Issue 8 draft 3 page 516 line 18330 section 2.4.3, change:
the behavior is undefined if the signal handler refers to any object other than errno with static or thread storage duration that is not a lock-free atomic object, other than by assigning a value to an object declared as volatile sig_atomic_t, or if the signal handler calls any function or function-like macro defined in this standard other than one of the functions and macros specified below as being async-signal-safe.
to:
the behavior is undefined if:
  • The signal handler refers to any object other than errno with static or thread storage duration that is not a lock-free atomic object, and not a non-modifiable object (for example, string literals, objects that were defined with a const-qualified type, and objects in memory that is mapped read-only), other than by assigning a value to an object declared as volatile sig_atomic_t, unless the previous modification (if any) to the object happens before the signal handler is called and the return from the signal handler happens before the next modification (if any) to the object.

  • The signal handler calls any function or function-like macro defined in this standard other than one of the functions and macros specified below as being async-signal-safe.

On Issue 8 draft 3 page 2049 line 67324 section signal(), change:
the behavior is undefined if the signal handler refers to any object [CX]other than errno[/CX] with static or thread storage duration that is not a lock-free atomic object, other than by assigning a value to an object declared as volatile sig_atomic_t, or if the signal handler calls any function defined in this standard other than [CX]one of the functions listed in Section 2.4 (on page 511)[/CX].
to:
the behavior is undefined if:
  • The signal handler refers to any object [CX]other than errno[/CX] with static or thread storage duration that is not a lock-free atomic object, [CX]and not a non-modifiable object (for example, string literals, objects that were defined with a const-qualified type, and objects in memory that is mapped read-only)[/CX], other than by assigning a value to an object declared as volatile sig_atomic_t, [CX]unless the previous modification (if any) to the object happens before the signal handler is called and the return from the signal handler happens before the next modification (if any) to the object[/CX].

  • The signal handler calls any function defined in this standard other than [CX]one of the functions listed in Section 2.4 (on page 511)[/CX].





Viewing Issue Advanced Details
739 [1003.1(2013)/Issue7+TC1] System Interfaces Editorial Clarification Requested 2013-08-23 01:11 2023-05-16 10:42
dalias
 
normal  
Applied  
Accepted As Marked  
   
Rich Felker
musl libc
strftime
2023
64612-64623
---
Note: 0006141
CX requirements for strftime seem to conflict with ISO C
The POSIX text for the %F format is:

"[CX] Equivalent to %+4[Option End]Y-%m-%d if no flag and no minimum field width are specified. [ tm_year, tm_mon, tm_mday]"

whereas the ISO C text is:

"%F is equivalent to ''%Y-%m-%d'' (the ISO 8601 date format). [tm_year, tm_mon, tm_mday]"

My reading of the ISO C text is that a conforming application could assume calling strftime with "%F" and with "%Y-%m-%d" produces identical output.
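
A sketch of the assumption such an application would be making. The helper name f_matches_ymd is hypothetical. For years between 1000 and 9999 the two formats are expected to agree, while for other years the result may differ between implementations:

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: return 1 if "%F" and "%Y-%m-%d" format `tm`
 * identically, 0 if they differ, -1 if either conversion fails. */
int f_matches_ymd(const struct tm *tm)
{
    char a[64], b[64];
    if (strftime(a, sizeof a, "%F", tm) == 0)
        return -1;
    if (strftime(b, sizeof b, "%Y-%m-%d", tm) == 0)
        return -1;
    return strcmp(a, b) == 0;
}
```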

One could see this as a bug in the C standard, since %Y-%m-%d does not match ISO 8601, despite the above parenthetical remark.

This issue could be resolved by requiring (and indeed, I believe this is the only way an implementation can currently comply with both the POSIX and C requirements) that %Y behaves as %+4Y.
Either add text requiring that %Y behave as %+4Y, or forward the issue to the C committee for a decision on whether C's specification of %F is erroneous.
dr_strftime.html (3 KB) 2014-05-29 15:21
Notes
(0006141)
geoffclare   
2023-01-26 17:18   
Change:
Equivalent to %[CX]+4[/CX]Y-%m-%d if no flag and no minimum field width are specified.
to:
Equivalent to %Y-%m-%d if no flag and no minimum field width are specified. (For years between 1000 and 9999 inclusive this provides the ISO 8601:2004 complete representation, extended format date representation of a specific day.)

On 2018 edition page 2049 line 65723 section strftime() APPLICATION USAGE, after:
These two forms can be produced with the '0' flag and a minimum field width options using the conversions specifications %04Y and %01Y, respectively.
add:
Similarly, because %Y is part of %F, field widths of 10 and 7 (%010F, %07F), respectively, produce the same effect in the year portion of the %F conversion result.

On 2018 edition page 2049 line 65725 section strftime() APPLICATION USAGE, change:
For years in the range [0001,9999], POSIX.1-2017 requires that the output produced match the ISO 8601:2004 standard complete representation extended format (YYYY-MM-DD) and for years outside of this range produce output that matches the ISO 8601:2004 standard expanded representation extended format (<+/-><Underline>Y</Underline>YYYY-MM-DD).
to:
For years in the range [1000,9999], POSIX.1-2017 requires that the output produced match the ISO 8601:2004 standard complete representation extended format (YYYY-MM-DD) and for years greater than 9999 produce output that matches the ISO 8601:2004 standard expanded representation extended format (<+/-><Underline>Y</Underline>YYYY-MM-DD). For years less than 1000, %F is not required to produce an ISO 8601:2004 format when used without specifying at least a minimum field width. As stated above, some implementations pad %Y conversions with zeros to four digits, in which case %F produces an ISO 8601:2004 format; other implementations do not pad %Y with zeros, in which case %F does not produce an ISO 8601:2004 format.




Viewing Issue Advanced Details
1273 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Error 2019-07-27 10:49 2023-10-10 09:10
stephane
 
normal  
Applied  
Accepted As Marked  
   
Stephane Chazelas
glob()
1109, 1110 (in 2018 edition)
35742, 35768
Approved
Note: 0006426
glob()'s GLOB_ERR/errfunc and non-directory files
In the XSH glob() specification,

For GLOB_ERR, the spec says:

> Cause glob() to return when it encounters a directory that it
> cannot open or read. Ordinarily, glob() continues to find
> matches.

(Note: it's not clear what "Ordinarily" means here. When errfunc
is set and returns non-zero, glob() doesn't continue, is it
ordinary?).

For errfunc:

> If, during the search, a directory is encountered that cannot
> be opened or read and errfunc is not a null pointer, glob()
> calls (*errfunc()) with two arguments.
[...]
>  2. The eerrno argument is the value of errno from the
> failure, as set by opendir(), readdir(), or stat().
> (Other values may be used to report other errors not
> explicitly documented for those functions.)

(Note: does that mean glob() has to call those 3 functions (as
opposed to open(O_DIRECTORY)/getdents() or any other API)? Why
stat(), shouldn't that be lstat()?)

First (and that's still not the case I'm making here), it's not
obvious what /directories/ glob() will try to open.

It can be somewhat inferred from the spec, as the pathname
expansion specification refers to directories that must be
readable (which implies they are going to be read) and some that
only need to be searchable (implying they're not going to be
read).

But maybe the spec should be more explicit, as it's not obvious
for instance that in */*.c the current directory and all the
subdirs are going to be read, while in */foo.c, only the
current directory is read (and all subdirs/foo.c lstat()ed), so
if there's a non-readable subdir, only the former will fail (or
cause errfunc to be invoked).

Now, to get to the point, the spec refers to "directories" that
can't be opened.

What about a /etc/passwd/*.c glob. /etc/passwd is not a
directory, opendir("/etc/passwd") if called would fail with
ENOTDIR, does that mean glob() should not call opendir() here or
that it should ignore opendir()'s error when errno is ENOTDIR?

What about */*.c where there's at least one non-directory
non-hidden file in the current directory? What if there's a
broken symlink or a symlink to a file that is not accessible
(and so for which we can't tell whether the symlink is a
directory or not)?

I've done tests with the FreeBSD 12.0, Solaris 10 and GNU libc
2.27 implementations of glob() and they all differ
significantly, the Solaris one being the least compliant to what
I can infer the spec to require, and FreeBSD's the most.

On Solaris /etc/passwd/*.c glob(GLOB_ERR) fails (and calls
errfunc with /etc/passwd, ENOTDIR), same for */*.c in a
directory that contains a non-hidden regular file.

Only FreeBSD's glob(GLOB_ERR) doesn't fail on non-existent/*.c
or */*.c in a directory that contains a broken symlink. The
other two call errfunc with ENOENT.

For */*.c in a directory that contains a symlink to a
non-accessible area, they all fail (call errfunc with EACCES).
Same with */*/*.c if the current directory contains a subdir
that is readable but not searchable (note that whether glob()
could tell whether entries of that directory are directories or
not depends on whether readdir() returns that information or
not; either way, we can't tell for symlinks).
At this point, I just want to start the discussion as to how
best to fix it.

- The "ordinarily" should probably be changed to "if errfunc is
  NULL"

- I don't think we want to force implementations to literally
  call opendir()/readdir()/lstat() (in any case, that "stat()"
  is wrong). Not sure how to phrase it though.

- we should probably clarify which directories glob() is meant
  to try opening, or which files glob() is meant to invoke
  opendir() or equivalent on.

- and then what to do for non-directories or files which we
  can't tell whether they're directories or not. Either require
  the FreeBSD or GNU behaviour or allow both. The Solaris
  behaviour is not useful IMO, but it's more flexible in that
  the caller can use an errfunc that ignores ENOENT/ENOTDIR to
  emulate the GNU/FreeBSD behaviour.
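
The last point can be sketched as follows. The names tolerant_errfunc and tolerant_glob are hypothetical. Whether (and for which errors) glob() actually calls the callback is implementation-dependent, which is the subject of this report:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <glob.h>

/* Hypothetical error callback: ignore "missing" and "not a directory"
 * errors, and ask glob() to stop on anything else (e.g. EACCES). */
int tolerant_errfunc(const char *epath, int eerrno)
{
    (void)epath;
    if (eerrno == ENOENT || eerrno == ENOTDIR)
        return 0;   /* zero: continue the search */
    return 1;       /* non-zero: glob() stops and returns GLOB_ABORTED */
}

/* Hypothetical wrapper: expand `pattern` with the callback above.
 * GLOB_ERR is deliberately not set, so the callback alone decides. */
int tolerant_glob(const char *pattern, glob_t *out)
{
    return glob(pattern, 0, tolerant_errfunc, out);
}
```

The caller is expected to call globfree() on the result, as with any other use of glob().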
Notes
(0006426)
geoffclare   
2023-08-10 15:30   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
None.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Page and line numbers are for Issue 8 draft 3.

On page 1201 line 41070 section glob() (GLOB_ERR), change:
Cause glob() to return when an attempt to open, read, or search a directory fails because of an error condition that is related to file system contents. If this flag is not set, glob() shall not treat such conditions as an error, and shall continue to look for matches. Other error conditions may also be treated the same way as error conditions that are related to file system contents.
to:
Cause glob() to return when an attempt to open or search a pathname as a directory, or an attempt to read an opened directory, fails because of an error condition that is related to file system contents and prevents glob() from expanding the pattern. If this flag is not set, glob() shall not treat such conditions as an error, and shall continue to look for matches. Other error conditions may also be treated the same way as error conditions that are related to file system contents.


On Issue 8 draft 3 page 1202 line 41114 section glob(), change:
If, during the search, an attempt to open, read, or search a directory fails and errfunc is not a null pointer, ...
to:
If errfunc is not a null pointer and, during the search, an attempt to open or search a pathname as a directory, or an attempt to read an opened directory, fails because of an error condition that prevents glob() from expanding the pattern, ...


After page 1203 line 41124 section glob(), add a paragraph:
The set of error conditions that are considered to prevent glob() from expanding the pattern shall include [EACCES], [ENAMETOOLONG], and [ELOOP]. It is implementation-defined what other error conditions are included in the set.


After page 1204 line 41202 section glob() (RATIONALE), add:
Implementations differ as to which error conditions they consider prevent glob() from expanding the pattern. The standard requires that [EACCES], [ENAMETOOLONG], and [ELOOP] are included because in these cases the expansion could succeed if performed with a different effective user or group ID, or with an alternative pathname (shorter than {PATH_MAX}, or traversing fewer symbolic links).
    
Implementations are encouraged to call (*errfunc()) for all error conditions that are related to file system contents which occur when attempting to open or search a pathname as a directory or attempting to read an opened directory. The appropriate way to handle such errors varies according to the provenance of the pattern and what the application will do with the resulting pathnames, and should therefore be for the application to decide. For example, given the pattern "non-existing/*", some applications may want glob() to succeed and return an empty list because there are no existing files that match the pattern, but for others that would not be appropriate, such as if an application asks the user to name a directory containing files to be processed and the user makes a typing mistake when responding; the application will want to alert the user to the mistake instead of behaving as if the user had named an empty directory. If (*errfunc()) is called for [ENOENT] errors, the first application can ignore them in that function, but if (*errfunc()) is not called, the second application cannot achieve what it wants using glob(). Similar reasoning applies for the pattern "regular_file/*" and [ENOTDIR] errors.


On page 1205 line 41217 section glob(), change FUTURE DIRECTIONS from "None" to:
A future version of this standard may require that (*errfunc()) is called for all error conditions that are related to file system contents which occur when attempting to open or search a pathname as a directory or attempting to read an opened directory.


On page 2508 line 81856 section 2.14.3, change:
If these permissions are denied, or if an attempt to open or search a directory fails because of an error condition that is related to file system contents, this shall not be considered an error and pathname expansion shall continue as if the directory had existed and had been successfully opened or searched, and no matching directory entries had been found in it.
to:
If these permissions are denied, or if an attempt to open or search a pathname as a directory, or an attempt to read an opened directory, fails because of an error condition that is related to file system contents, this shall not be considered an error and pathname expansion shall continue as if the pathname had named an existing directory which had been successfully opened and read, or searched, and no matching directory entries had been found in it.




Viewing Issue Advanced Details
1406 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2020-09-28 21:26 2023-10-10 09:16
djdelorie
 
normal  
Applied  
Accepted As Marked  
   
DJ Delorie
Red Hat Inc
open_memstream
https://pubs.opengroup.org/onlinepubs/9699919799/functions/open_memstream.html
n/a
---
See Note: 0006489.
clarification of SEEK_END when current pointer doesn't match buffer size
Consider a stream created by open_memstream(), where 16 bytes are written, fseek(0,SEEK_SET) to rewind, then write 4 bytes, and fflush(). At this point, the value pointed to by the sizep argument to open_memstream() should be 4 (please confirm).
At this point in the state of the stream, what are the semantics of SEEK_END? What will be the "file size" if you fclose() at this point?
The example explicitly SEEK_SETs to the buffer size before fclose(), eliding the issue.
Please clarify if SEEK_END is relative to the current position or the current buffer length, and if it's changed by a call to fflush() at that time.
Please clarify if a SEEK_SET to set the current pointer less than the current buffer size, itself (without read/write), changes the SEEK_END semantics, or the value stored in *sizep after fflush().
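
A sketch that reproduces the sequence described above, so the reported size can be inspected on a given implementation. The helper name memstream_rewrite and the byte values written are hypothetical. No particular size is asserted here, since that is the question being asked:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper reproducing the sequence from the report.
 * On success returns 0, stores the size reported after fflush() in
 * *size_after_flush and copies the first four buffer bytes to head. */
int memstream_rewrite(size_t *size_after_flush, char head[4])
{
    char *buf = NULL;
    size_t size = 0;
    FILE *fp = open_memstream(&buf, &size);
    if (fp == NULL)
        return -1;

    int ok = fputs("0123456789abcdef", fp) != EOF   /* 16 bytes */
          && fseek(fp, 0L, SEEK_SET) == 0            /* rewind */
          && fputs("WXYZ", fp) != EOF                /* 4 bytes */
          && fflush(fp) == 0;
    if (ok) {
        *size_after_flush = size;   /* the value the report asks about */
        memcpy(head, buf, 4);
    }
    fclose(fp);
    free(buf);
    return ok ? 0 : -1;
}
```

Adding an fseek(fp, 0L, SEEK_END) followed by ftell(fp) before the fclose() would show which of the two SEEK_END interpretations an implementation uses.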
Notes
(0006489)
Don Cragun   
2023-09-25 16:13   
New text that allows both behaviours, as an Issue 8 compromise:
Add a new paragraph after issue8 draft 3 P1617. L50906 (in open_memstream())
The fseek() and fseeko() functions can be used to set the file position beyond the current buffer length. It is implementation-defined whether this extends the buffer to the new length. If it extends the buffer, the added buffer contents shall be set to null bytes for open_memstream(), or null wide characters for open_wmemstream(); if it does not extend the buffer, then if data is later written at this point, the buffer contents in the gap shall be set to null bytes for open_memstream(), or null wide characters for open_wmemstream(). If fseek() or fseeko() is called with SEEK_END as the whence argument, it is implementation-defined whether the file position shall be adjusted either relative to the current buffer length or relative to the buffer size that would be set by an fflush() call made immediately before the fseek() or fseeko() call.




Viewing Issue Advanced Details
1614 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Omission 2022-11-03 13:36 2023-09-04 10:28
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XSH 3/mktime
1331
44331-44332
Approved
See Note: 0006402.
XSH 3/mktime does not specify EINVAL and should
First, it is possible this is a duplicate bug report, if this issue
has already (or is to be already) handled by some other bug, and this
report doesn't add anything new, then please simply link this to that
as a duplicate, and then close this.

It is possible for the broken down local time passed to mktime() to
represent a time that never existed, eg: a time in the gap (most commonly
an hour) when summer time commences and the clock skips forward from
NN:59 to PP:00, where PP==(NN+2)%24, and times MM:xx (usually for all xx),
where MM==(NN+1)%24, simply never existed (with appropriate adjustments to
tm_mday, etc, if the %24 makes any difference to the answer in each case).
Gaps shorter than an hour also exist, but for our purposes here, as an
example, that's not important.

If one specifies an input time of MM:20, that simply cannot be converted
to a time_t; it kind of represents NN:59 + 21 minutes, except that local
time is really PP:20, not MM:20.

Note that tm_isdst does not help here, as the time simply does not exist;
it cannot be either summer or standard (or standard or winter as appropriate)
time. It is possible for an implementation to use tm_isdst as a hint
to what happened so cases like this work:

    timeptr = localtime(&t);
    timeptr->tm_min += 20;
    t = mktime(timeptr);

where if tm_isdst should have changed value to represent the new time
calculated, but clearly here does not (and since timeptr came from
localtime, we know tm_isdst is 0 or 1) then the implementation might be
able to guess what was intended.

That does not always work however; there can be gaps in times for reasons
other than the tm_isdst changing seasonal adjustment, such as when a locality
decides to switch from one zone offset to another (eg: sometime, I am too
lazy to look up when, Singapore and Malaysia switched from UTC+0700 to
UTC+0800 to align their clocks with Hong Kong, which was apparently
considered important, at least for Singapore). Neither used summer time
before or after that change, tm_isdst is 0 in both zones - but there was
an hour there that never existed.

Similarly, when seasonal time ends, and time jumps backwards, there is an
hour (most commonly) of local time when the time runs twice. If one specifies
a time which is in (one of) those periods, along with a tm_isdst = -1, then
it is impossible to determine which time_t value should apply - EINVAL is
returned in that case.

Note that here, tm_isdst is used to handle this overlap when it is caused
by seasonal adjustments - but just as with the gap, that only works when
the duplicated times are for that reason. If Malaysia decided (which would be
odd, indeed, but could happen) to have their clocks match Thailand,
much of Indonesia (including Jakarta), and Laos and Cambodia, rather than
Singapore, they could jump back to UTC+0700 by running one hour
twice - with tm_isdst==0 in both occurrences.
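
As a rough illustration of the seasonal overlap case described above, here is a minimal sketch. It assumes the POSIX-style TZ rule "EST5EDT,M3.2.0,M11.1.0" (chosen only so the example does not depend on a timezone database) and a hypothetical helper name, repeated_0130(). On 2023-11-05 under that rule, 01:30 local time occurs twice, and the tm_isdst hint is what selects between the two time_t values.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: convert 2023-11-05 01:30:00 local time, under the
 * assumed TZ rule "EST5EDT,M3.2.0,M11.1.0", using the given tm_isdst hint.
 * That local time occurs twice: once in EDT and once in EST. */
time_t repeated_0130(int isdst_hint)
{
    struct tm tm;

    assert(isdst_hint == 0 || isdst_hint == 1);
    setenv("TZ", "EST5EDT,M3.2.0,M11.1.0", 1);
    tzset();

    memset(&tm, 0, sizeof tm);
    tm.tm_year = 2023 - 1900;
    tm.tm_mon = 10;            /* November */
    tm.tm_mday = 5;            /* first Sunday: DST ends at 02:00 */
    tm.tm_hour = 1;
    tm.tm_min = 30;
    tm.tm_isdst = isdst_hint;  /* 1 selects the EDT pass, 0 the EST pass */
    return mktime(&tm);
}
```

The two calls repeated_0130(1) and repeated_0130(0) should differ by exactly one hour. With tm_isdst set to -1 instead, which of the two values comes back is the ambiguity under discussion.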

There is another way that (since bug 1533 was applied) could be considered
reasonable to handle the ambiguous case - one could use the value of tm_gmtoff
to determine which of the possible time_t's to return. This even handles
(never seen that I am aware of, and very unlikely to ever happen) cases
where a time runs more than twice, which tm_isdst cannot do.

It appears very tempting to make use of that to resolve this problem, but
I would strongly advise against it. Doing so would break current conforming
applications (which simple addition of the fields does not), as they currently
do not, and cannot, set that field (the tm_gmtoff field does not appear in the
existing standard - it is not even, as of the date of this report, in a
published draft of an upcoming version of the standard). Its value from such
an application will be uninitialised (or perhaps 0); if mktime() were to attempt
to reference it, undefined behaviour might result. That's unacceptable.

It is not even OK to permit implementations to use tm_gmtoff by making its
use for this purpose unspecified, for the same reason - any implementation
that does risks undefined behaviour from a currently conforming application.
mktime() must not be permitted to reference that field (in the incoming
structure) at all.


An error code is required to handle this invalid or ambiguous input.
EINVAL is the usual one.
Between lines 44331 and 44332 (of I7 TC2) add:

    [EINVAL] The local time specified represents a local time which
               either did not exist in the past, or is not expected to exist
               in the future, or the local time specified represents a
               local time which existed, or is expected to exist in the
               future, more than once, and the value supplied in tm_isdst
               is unable to resolve the ambiguity.

It might also be (assuming that you all agree with my reasoning above)
a good idea to add something in the Rationale (line 44359, currently "None")
explaining why tm_gmtoff is not (at least now) considered appropriate to
use to resolve the ambiguous case. I will leave it up to you to determine
whether it would be worth adding a Future Direction indicating that that
might be changed at some future time (ie: advising applications to ensure
that they start setting tm_gmtoff before calling mktime() - it will currently
be ignored, but one day, might be essential in this odd case).

           
Notes
(0006402)
geoffclare   
2023-07-25 14:08   
(edited on: 2023-07-25 14:16)
The following is a suggested resolution which just covers the originally raised issue of the allowed behaviour for "non-existent" times and the function's error handling.

The side-issue of how the conversion from broken-down time to time since the Epoch is performed, particularly with respect to out-of-range struct tm member values, should be addressed in bug 0001627.

Interpretation response
------------------------

The standard clearly states that when an unsuccessful call to mktime() returns (time_t)-1 it sets errno to [EOVERFLOW], and conforming implementations must conform to this.

Rationale:
-------------

The RETURN VALUE section on the mktime() page states:
If the time since the Epoch cannot be represented, the function shall return the value (time_t)-1 [CX]and set errno to indicate the error[/CX].
This requires that errno is set to indicate "the error", and the beginning of the sentence states the nature of the error condition to which "the error" refers: the time since the Epoch (i.e. the integer value to be returned) cannot be represented. The ERRORS section requires that the error number [EOVERFLOW] is used for this condition.

Thus the standard requires that errno is set to [EOVERFLOW] when an unsuccessful call to mktime() returns (time_t)-1 and an implementation that sets it to [EINVAL] does not conform.

The mktime() function does not have any way to indicate to the caller that an error other than [EOVERFLOW] occurred.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

On page 425 line 14451 section <time.h>, after applying bug 1253 change:
The value of tm_isdst shall be positive if Daylight Saving Time is in effect, 0 if Daylight Saving Time is not in effect, and negative if the information is not available.
to:
When tm_isdst is set by an interface defined in this standard, its value shall be positive if Daylight Saving Time (DST) is in effect and 0 if DST is not in effect. [CX]It shall not be set to a negative value by any interface defined in this standard. When tm_isdst is passed to the mktime() function, it specifies how mktime() is to handle DST when calculating the time since the Epoch value; see [xref to mktime()].[/CX]

On page 1331 line 44310 section mktime(), delete:
A positive or 0 value for tm_isdst shall cause mktime() to presume initially that Daylight Savings Time, respectively, is or is not in effect for the specified time. A negative value for tm_isdst shall cause mktime() to attempt to determine whether Daylight Savings Time is in effect for the specified time.

On page 1331 line 44317 section mktime(), change:
corrected for timezone and any seasonal time adjustments
to:
corrected for the offset of the timezone's standard time from Coordinated Universal Time and further corrected (if applicable--see below) for Daylight Saving Time

After page 1331 line 44321 section mktime(), add these new paragraphs:
[CX]If the timezone is one that includes Daylight Saving Time (DST) adjustments, the value of tm_isdst in the tm structure controls whether or not mktime() adjusts the calculated seconds since the Epoch value by the DST offset (after it has made the timezone adjustment), as follows:
  • If tm_isdst is zero, mktime() shall not further adjust the seconds since the Epoch by the DST offset.

  • If tm_isdst is positive, mktime() shall further adjust the seconds since the Epoch by the DST offset.

  • If tm_isdst is negative, mktime() shall attempt to determine whether DST is in effect for the specified time; if it determines that DST is in effect it shall produce the same result as an equivalent call with a positive tm_isdst value, otherwise it shall produce the same result as an equivalent call with a tm_isdst value of zero. If the broken-down time specifies a time that is either skipped over or repeated when a transition to or from DST occurs, it is unspecified whether mktime() produces the same result as an equivalent call with a positive tm_isdst value or as an equivalent call with a tm_isdst value of zero.


If the TZ environment variable specifies a geographical timezone for which the implementation's timezone database includes historical or future changes to the offset from Coordinated Universal Time of the timezone's standard time, and the broken-down time corresponds to a time that was (or will be) skipped over or repeated due to the occurrence of such a change, mktime() shall calculate the time since the Epoch value using either the offset in effect before the change or the offset in effect after the change.[/CX]

On page 1331 line 44323 section mktime(), after applying bug 1613 change:
with the specified time since the Epoch as its argument
to:
with the calculated time since the Epoch as its argument

On page 1331 line 44327 section mktime(), change:
The mktime() function shall return the specified time since the Epoch encoded as a value of type time_t. If the time since the Epoch cannot be represented, the function shall return the value (time_t)-1 [CX]and set errno to indicate the error[/CX].
to:
The mktime() function shall return the calculated time since the Epoch encoded as a value of type time_t. If the time since the Epoch cannot be represented as a time_t [CX]or the value to be returned in the tm_year member of the structure pointed to by timeptr cannot be represented as an int[/CX], the function shall return the value (time_t)-1 [CX]and set errno to [EOVERFLOW], and shall not change the value of the tm_wday component of the structure.[/CX]

[CX]Since (time_t)-1 is a valid return value for a successful call to mktime(), an application wishing to check for error situations should set tm_wday to a value less than 0 or greater than 6 before calling mktime(). On return, if tm_wday has not changed an error has occurred.[/CX]

On page 1332 line 44348 section mktime(), change:
if (mktime(&time_str) == -1)
to:
time_str.tm_wday = -1;
if (mktime(&time_str) == (time_t)-1 && time_str.tm_wday == -1)

On page 1332 line 44359 section mktime(), change RATIONALE from "None" to:
In order to allow applications to distinguish between a successful return of (time_t)-1 and an [EOVERFLOW] error, mktime() is required not to change tm_wday on error. This mechanism is used rather than the convention used for other functions whereby the application sets errno to zero before the call and the call does not change errno on error because the ISO C standard does not require mktime() to set errno on error. The next revision of the ISO C standard is expected to require that mktime() does not change tm_wday when returning (time_t)-1 to indicate an error, and that this return convention is used both for the case where the value to be returned by the function cannot be represented as a time_t and the case where the value to be returned in the tm_year member of the tm structure cannot be represented as an int.

The DESCRIPTION section says that mktime() converts the specified broken-down time into a time since the Epoch value. The use of the indefinite article here is necessary because, when tm_isdst is negative and the timezone has Daylight Saving Time transitions, there is not a one-to-one correspondence between broken-down times and time since the Epoch values.

The description of how the value of tm_isdst affects the behavior of mktime() is shaded CX because the requirements in the ISO C standard are unclear. The next revision of the ISO C standard is expected to state the requirements using wording equivalent to the wording in this standard.
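
The tm_wday sentinel convention from the resolution above can be sketched as a small wrapper. This is only an illustration: checked_mktime() and set_tz() are hypothetical names, and the sketch relies on mktime() leaving tm_wday unchanged on error, which is what the resolution requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: select a timezone for subsequent conversions. */
void set_tz(const char *tz)
{
    assert(tz != NULL);
    setenv("TZ", tz, 1);
    tzset();
}

/* Hypothetical wrapper: returns 0 and stores the result in *out on
 * success, or returns -1 if mktime() reported an error. */
int checked_mktime(struct tm *tm, time_t *out)
{
    assert(tm != NULL && out != NULL);
    tm->tm_wday = -1;                 /* sentinel outside the range 0..6 */
    *out = mktime(tm);
    if (*out == (time_t)-1 && tm->tm_wday == -1)
        return -1;                    /* tm_wday untouched: a real error */
    return 0;                         /* (time_t)-1 can be a valid result */
}
```

With TZ=UTC0, the broken-down time 1969-12-31 23:59:59 is the case where a successful call legitimately returns (time_t)-1; the wrapper reports success there because tm_wday has been filled in.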






Viewing Issue Advanced Details
1621 [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers Editorial Clarification Requested 2022-12-02 17:30 2023-05-16 11:14
steffen
 
normal  
Applied  
Accepted As Marked  
   
steffen
Definitions (3.250 + 1), System Interfaces (getdelim())
73, 1026
2121 + 1, 34991
---
See Note: 0006104.
Add "null terminator" definition, adjust getdelim(3) usage accordingly
The standard contains usages of "null terminator", documenting only "null byte" terminators. It uses "NUL terminator" at one place (added in Issue 7 for getdelim() said Geoff Clare on austin-group-l@ in Y4nXMYB9tFwxeYx5@localhost).

I mention that I personally only ever used NUL (the "character" name in ISO 646 / LATIN1 / Unicode) for any such purpose, but the standard defines C-style "3.375 String"s indeed as "A contiguous sequence of bytes terminated by and including the first null byte".

So here the clarification request.
On page 73, insert after line 2121

 3.251 Null terminator
 A term used for the Null Byte of Section 3.248 when used as a terminator for String of Section 3.375 (on page REF).

On page 1026, for getdelim(3), change on line 34991

  Although a NUL terminator is always supplied after the line[.]

to

  Although a null terminator is always supplied after the line[.]
Notes
(0006104)
Don Cragun   
2023-01-09 17:16   
On page 73, insert after line 2121:
3.251 Null Terminator
 
A term used for the null byte when used as a terminator for a string.


On page 1026, for getdelim(3), change on line 34991:
Although a NUL terminator is always supplied after the line...
to:
Although a null terminator is always supplied after the line...




Viewing Issue Advanced Details
1625 [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers Objection Omission 2023-01-04 02:18 2023-05-16 10:52
philip-guenther
 
normal  
Applied  
Accepted  
   
Philip Guenther
OpenBSD
waitid
waitid
111
3066
---
waitid should be marked as async-signal-safe and a memory-synchronization point
(I only have issue 7 on hand, so section and line references will be off)

I believe that, historically, waitid() has been implemented in a manner similar to waitpid(), with an underlying system call and synchronization inside the kernel. Because of that, it has had the same sort of intrinsic behaviors described for wait() and waitpid() in XBD 4.12, "Memory Synchronization" and in XSH 2.4.3's list of async-signal-safe functions. However, when it was added to the standard it was not included in those lists.
Add waitid() to the lists of both the memory synchronization and async-signal-safe functions.
Notes




Viewing Issue Advanced Details
1627 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Enhancement Request 2023-01-05 12:17 2023-09-04 10:34
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XSH 3/mktime
1331
44310-44332, 44361
---
Note: 0006415
XSH 3 / mktime() is woefully underspecified
Following on from notes added to bug:1614 and a lengthy
mailing list discussion, it is evident that the specification
of XSH/mktime is woefully inadequate.

New text is specified in the Desired Action to remedy those defects.

This is currently missing anything dealing with what should be done
if the input tm_isdst is not < 0, and does not agree (in sign, if 0
can be said to have a sign) with the final value for tm_isdst in the
struct tm on a successful return.

That's because my inclination is to simply do nothing in that case,
return the correct tm_isdst, but otherwise ignore it - but I admit that's
not how the implementations behave, and that may be being depended upon
by some applications (though the current behaviour is definitely not
required by any standard). So I will leave it for someone who cares
about that to add suitable text to (properly) specify what is to happen.

Also, given that it is too late now to consider adding a timegm()
function (an analog to mktime() which has existed for decades, but
never been standardised) I thought what might be possible would be
to specify enough in the FUTURE DIRECTIONS here to indicate that that
will happen (since it is being added to the C standard, it will happen,
eventually) and to indicate why using it is a much better idea when
the purpose is time_t arithmetic than using localtime()/mktime().
The intent is to get applications to start writing safe code, rather
than nonsense, and do that asap - since in practice, timegm() is
already widely available.

As usual, formatting and wording changes, which keep to the general
intent expressed below are welcome. One thing I considered, but
wasn't sure of a good way to handle, was to find some shorter way
to express "the converted value of the struct tm referred to by
timeptr" (or a field thereof) - which occurs far too often in the
text below, and is unwieldy.
Delete the (CX shaded) paragraph that starts (line 44310)

        A positive or 0 value for tm_isdst ...
and ends (line 44313)
        ... is in effect for the specified time.

Replace the (CX) shaded paragraph that starts (line 44315)

        The relationship between the tm structure ...
and ends (line 44321)
        ... the other tm structure members specified in <time.h>
        (excluding tm_wday).

with the following (presumably also CX shaded) paragraphs:

        The mktime() function will first convert the tm_sec, tm_min,
        tm_hour, tm_mon, tm_mday and tm_mon (again) fields of the tm
        structure referenced by timeptr (or a local internal copy thereof),
        in that order, so that their values become within the ranges specified
        by <time.h>, but also within the ranges applicable to a Gregorian
        Calendar date (tm_sec shall not be more than 59, and tm_mday shall
        not be more than the number of days in the month specified by
        the tm_mon field of the year specified by the tm_year field).

        If _field_ represents the field being converted, and _next_
        represents the field specified immediately after it in <time.h>
        then this shall be done, for each field, by an algorithm equivalent
        to:

                if (timeptr->_field_ < MINIMUM_VALUE) {
                        while (timeptr->_field_ < MINIMUM_VALUE) {
                                timeptr->_field_ += RANGE;
                                timeptr->_next_--; /* check overflow */
                        }
                } else if (timeptr->_field_ > MAXIMUM_VALUE) {
                        while (timeptr->_field_ > MAXIMUM_VALUE) {
                                timeptr->_field_ -= RANGE;
                                timeptr->_next_++; /* check overflow */
                        }
                } /* else do nothing, value of _field_ is OK */

        where MINIMUM_VALUE is the minimum allowed value for the
        field _field_ as specified in <time.h>, MAXIMUM_VALUE is the
        maximum allowed value for the field _field_ as specified in
        <time.h> except that it shall be 59 where _field_ is tm_sec,
        and shall be the appropriate number of days in the specific
        month selected by the tm_mon and tm_year fields, where _field_
        is tm_mday, and thus is subject to change during each iteration
        of the loop, and RANGE is (MAXIMUM_VALUE - MINIMUM_VALUE + 1)
        (which is also subject to change, in each iteration of both loops
        above, where the field is tm_mday).

        Note that there is no requirement that the actual structure
        passed via *timeptr be the one being modified by this code.

        Should overflow (absolute value of the field becomes too large
        to be represented in an int) occur at the places indicated,
        the implementation shall return an error if the _next_ field is
        tm_year, and may return an error for other fields, though if
        _next_ is not tm_year, it may adjust the value of any later field,
        and reduce the magnitude of the _next_ field by an appropriate
        amount to compensate. Adjustments made this way should be chosen
        so as to minimise the effects of the adjustment upon the meaning
        of the later field, for example, if tm_hour were to overflow,
        the implementation might adjust tm_mday by 146097 (the number of
        days in a 400 year period - since in the Gregorian calendar, that is
        a constant) and reduce the magnitude of tm_hour by 3506328 (24*146097,
        the number of hours in 400 years). Alternatively it might alter
        tm_mon by 4800 (the number of months in a 400 year period), and
        adjust tm_hour by the same amount (3506328). Overflow produced
        when making any such adjustment should be handled in a similar
        way, including, if an adjustment to tm_mon requires an adjustment
        to tm_year, and that causes tm_year to overflow, then an error
        shall be returned.

        The tm_isdst field of the structure referred to by timeptr (or
        a local copy thereof) shall be converted by altering any
        value that is less than 0 to be -1, and any value that is
        greater than 0 to be 1. If supplied as 0, no change shall
        be made.

        Once all fields are within the appropriate ranges, the
        implementation shall determine if there is a unique value
        of the type returned by time() (which is expressed as a value
        in Coordinated Universal Time) which when converted to a
        struct tm by a function equivalent to localtime() would
        produce identical values for the tm_sec tm_min tm_hour tm_mday
        tm_mon and tm_year fields of the converted input struct tm.
        This may be accomplished by applying a formula, similar to
        that specified for Coordinated Universal Time in <xref XBD 4.17>
        adjusted to account for local timezone offsets, and time
        alterations, or by any other means.

        If such a unique result is found, then that shall be the
        result from mktime().

        If no result is found because the tm structure represents
        a value outside the range of values that can be represented
        by a value returned by time(), then an error shall be returned.

        Otherwise if no result is able to be found, then the local time
        specified represents a time which does not exist as a local time
        value. In this case, if the value of tm_isdst in the struct tm
        specified by timeptr is greater than or equal to 0, and there
        are two values of the type returned by time(), representing times
        that are one second apart (t1 and t2, where t2 == t1 + 1 second),
        such that one of those, when converted by a function equivalent to
        localtime(), returns a time which occurs before the converted time
        referred to by timeptr, and the other returns a time which occurs
        later, and also one of those would produce a struct tm with
        tm_isdst == 0, and the other when converted by localtime() would
        produce a struct tm with tm_isdst == 1, then if the application's
        converted tm_isdst field is the same as that produced by t1, the
        implementation shall calculate the difference, in seconds, between
        the converted time specified by timeptr, and that produced by a
        conversion of t1, add that number of seconds to t1, and that shall
        be the result of mktime(). Otherwise, if the application's converted
        tm_isdst is the same as that produced by t2, the implementation shall
        calculate the difference (in seconds) between the struct tm
        produced by t2, and that specified by the converted struct tm
        referred to by timeptr, and subtract that number of seconds from
        t2, and that shall be the result from mktime(). In any other
        case the result is unspecified. The implementation may
        arbitrarily return one of the results as if it had been one of
        the two specified cases, or may return an error.

        If more than one possible result is found, then if there are
        exactly two possible results, and one of those, when converted by
        a function equivalent to localtime(), produces a value with tm_isdst
        having the same value as the converted value of that field in the
        struct tm referred to by timeptr, and the other does not, then
        the result of mktime() shall be the single unique result which
        produces a struct tm with the same tm_sec tm_min tm_hour tm_mday
        tm_mon tm_year and tm_isdst fields as the converted values in the
        struct tm referred to by timeptr. In any other case, the result
        is unspecified. The implementation may arbitrarily return any
        of the plausible ambiguous results, or may return an error.

This should then be followed by the new (bug 1613 inserted) text about
what happens to the struct tm in the case of a successful return. This
I believe has already replaced the "Upon successful completion, the values
of the tm_wday..." paragraph. If not, delete whatever is left of it.

A new paragraph (or just sentence perhaps) should be added after the 1613
inserted paragraph:

        When mktime() returns an error, the contents of the structure
        referred to by timeptr, after mktime() returns, shall be unspecified.

The RETURN VALUE section (lines 44327-9) should be replaced by:

        The mktime() function shall return the calculated time since the
        epoch, as specified in the DESCRIPTION, encoded as a value of
        type time_t. If an error is to be returned, then the function
        shall return the value (time_t)-1, and set errno to indicate the
        error.

The ERRORS section (lines 44331-2) should be replaced by

        The mktime() function shall fail if:

        [EOVERFLOW] The value of the time returned by time() which
                        represents the converted struct tm passed by
                        timeptr falls outside the range of values that
                        can be represented as a time_t.

        [EOVERFLOW] While correcting the values of the fields of the
                        struct tm referred to by timeptr to be within the
                        required ranges, a required adjustment of the tm_year
                        field caused that field to overflow.

        The mktime() function may fail if:

        [EOVERFLOW] Adjusting a field of the struct tm referred to
                        by timeptr caused an adjustment to be required to
                        another field, and that adjustment caused that other
                        field to overflow.

        [EINVAL] The converted struct tm referred to by timeptr
                        cannot be represented by a unique number of seconds
                        past the epoch, Coordinated Universal Time, and
                        the input values, and/or circumstances are not such
                        that an alternative is required to be selected.

In the FUTURE DIRECTIONS section (line 44361) replace "None." by

        A later edition of the standard is expected to add a timegm()
        function that is similar to mktime(), except that the struct tm
        referred to by timeptr represents a calendar time in Coordinated
        Universal Time (rather than the local time zone), where references
        to localtime() are replaced by references to gmtime(), and where
        there are no zone offset adjustments, or missing or ambiguous times,
        tm_isdst is always 0, and EINVAL cannot be returned. A combination
        of gmtime() and timegm() will be the expected way to perform
        arithmetic upon a time_t value and remain compatible with the C
        standard (where the internal structure of a time_t is not specified).
        Attempting such manipulations using localtime() and mktime() can lead
        to unexpected results.
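
(Illustration only, not part of the proposed text.) The gmtime()/timegm() combination mentioned above might look like the following sketch. It assumes timegm() is available as an extension, as it is on many current systems; it is not part of the edition of the standard under discussion. The helper name add_days_utc() is hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* assumption: exposes timegm() on glibc */
#include <assert.h>
#include <time.h>

/* Hypothetical helper: add whole days to a time_t by going through a
 * UTC broken-down time, so no timezone gaps or overlaps are involved. */
time_t add_days_utc(time_t t, int days)
{
    struct tm tm;
    struct tm *p = gmtime(&t);

    assert(p != NULL);
    tm = *p;                /* copy out of gmtime()'s static buffer */
    tm.tm_mday += days;     /* out-of-range value; timegm() corrects it */
    return timegm(&tm);
}
```

Because the conversion is done in Coordinated Universal Time, adding N days this way always moves the result by N * 86400 seconds, which is the property that localtime()/mktime() cannot guarantee.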
Notes
(0006415)
geoffclare   
2023-08-07 09:05   
(edited on: 2023-08-07 09:06)
Suggested resolution ...

On page 113 line 3186 section 4.16 Seconds Since the Epoch, change:
The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified.
to:
The relationship between the actual date and time in Coordinated Universal Time, as determined by the International Earth Rotation Service, and the system's current value for seconds since the Epoch is unspecified.

On page 1331 line 44320 section mktime(), after applying bugs 1613 and 1614 change:
The relationship between the tm structure (defined in the <time.h> header) and the time in seconds since the Epoch is that the result shall be as specified in the expression given in the definition of seconds since the Epoch (see [xref to XBD 4.19]) corrected for the offset of the timezone's standard time from Coordinated Universal Time and further corrected (if applicable--see below) for Daylight Saving Time, where the names other than tm_yday in the structure and in the expression correspond, and the tm_yday value used in the expression is the day of the year from 0 to 365 inclusive, calculated from the members of the tm structure specified above.
to:
The mktime() function shall calculate the time in seconds since the Epoch to be returned as if by manipulating the members of the tm structure according to the following steps.

1. The tm_sec member may, but should not, be brought into the range 0 to 60, inclusive. For each 60 seconds added to or subtracted from tm_sec, a decrement or increment, respectively, of 1 minute shall be saved for later application.

2. The tm_min member shall be brought into the range 0 to 59, inclusive, and any saved decrement or increment of minutes shall then be applied, repeating the range adjustment afterwards if necessary. For each 60 minutes added to or subtracted from tm_min, a decrement or increment, respectively, of 1 hour shall be saved for later application.

3. The tm_hour member shall be brought into the range 0 to 23, inclusive, and any saved decrement or increment of hours shall then be applied, repeating the range adjustment afterwards if necessary. For each 24 hours added to or subtracted from tm_hour, a decrement or increment, respectively, of 1 day shall be saved for later application.

4. The tm_mon member shall be brought into the range 0 to 11, inclusive. For each 12 months added to or subtracted from tm_mon, a decrement or increment, respectively, of 1 year shall be saved for later use.

5. The tm_mday member shall be brought into the range 1 to 31, inclusive, and any saved decrement or increment of days shall then be applied, repeating the range adjustment afterwards if necessary. Adjustments downwards shall be applied by subtracting the number of days (according to the Gregorian calendar) in month tm_mon+1 of the year obtained by adding/subtracting any saved increment/decrement of years to the value tm_year+1900, and then incrementing tm_mon by 1, repeated as necessary. Adjustments upwards shall be applied by adding the number of days in the month before month tm_mon+1 of the year obtained by adding/subtracting any saved increment/decrement of years to the value tm_year+1900, and then decrementing tm_mon by 1, repeated as necessary. During these adjustments, the tm_mon value shall be kept within the range 0 to 11, inclusive, by applying step 4 as necessary.

6. If the tm_mday member is greater than the number of days in month tm_mon+1 of the year obtained by adding/subtracting any saved increment/decrement of years to the value tm_year+1900, that number of days shall be subtracted from tm_mday, and tm_mon shall be incremented by 1. If this results in tm_mon having the value 12, step 4 shall be applied.

7. The number of seconds since the Epoch in Coordinated Universal Time shall be calculated from the range-corrected values of the relevant tm structure members (or the original value where a member was not range corrected) as specified in the expression given in the definition of seconds since the Epoch (see [xref to XBD 4.19]), where the names other than tm_year and tm_yday in the structure and in the expression correspond, the tm_year value used in the expression is the tm_year in the structure plus/minus any saved increment/decrement of years, and the tm_yday value used in the expression is the day of the year from 0 to 365 inclusive, calculated from the tm_mon and tm_mday members of the tm structure, for that year.

8. The time since the Epoch shall be corrected for the offset of the local timezone's standard time from Coordinated Universal Time.

9. The time since the Epoch shall be further corrected (if applicable--see below) for Daylight Saving Time.
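
As a rough illustration of step 7 only, the sketch below evaluates the seconds since the Epoch expression for an already range-corrected broken-down time in Coordinated Universal Time. The function names are hypothetical, the sketch assumes tm_year is at least 70, and it does not perform the range corrections of steps 1 to 6 or the timezone corrections of steps 8 and 9.

```c
#include <assert.h>
#include <time.h>

/* Gregorian leap-year test for a full year number such as 2024. */
static int is_leap_year(long long year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Day of the year (0..365) from range-corrected tm_mon and tm_mday. */
static int day_of_year(const struct tm *tm)
{
    static const int days_before[12] =
        { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    int yday = days_before[tm->tm_mon] + tm->tm_mday - 1;

    if (tm->tm_mon > 1 && is_leap_year(tm->tm_year + 1900LL))
        yday++;
    return yday;
}

/* Hypothetical helper: seconds since the Epoch (UTC) for a struct tm
 * whose members are already within their normal ranges. */
long long epoch_seconds_utc(const struct tm *tm)
{
    long long y, yday;

    assert(tm != NULL && tm->tm_year >= 70);
    assert(tm->tm_mon >= 0 && tm->tm_mon <= 11);
    y = tm->tm_year;
    yday = day_of_year(tm);

    return tm->tm_sec + tm->tm_min * 60LL + tm->tm_hour * 3600LL
         + yday * 86400LL + (y - 70) * 31536000LL
         + ((y - 69) / 4) * 86400LL
         - ((y - 1) / 100) * 86400LL
         + ((y + 299) / 400) * 86400LL;
}
```

For 2023-01-01 00:00:00 the expression evaluates to 1672531200, which matches the usual value for that instant.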

On page 1332 line 44357 section mktime(), change APPLICATION USAGE from "None" to:
When using mktime() to add or subtract a fixed time period (one that always corresponds to a fixed number of seconds) to or from a broken-down time in the local timezone, reliable results for arbitrary TZ can only be assured by using mktime() to convert the original broken-down time to a time since the Epoch, adding or subtracting the desired number of seconds to that value, and then calling localtime() with the result. The alternative of adjusting the broken-down time before calling mktime() may produce unexpected results if the original and updated times are on different sides of a geographical timezone change. On implementations that follow the recommendation of not range-correcting tm_sec (see step 1 in the DESCRIPTION), reliable results can also be assured by adding or subtracting the desired number of seconds to tm_sec (and not modifying any other members of the tm structure). In applications needing to be portable to non-POSIX systems where the time_t encoding is not a count of seconds, it is recommended that conditional compilation is used such that the adjustment is performed on the mktime() return value when possible, and otherwise on the tm_sec member. For timezones that are known not to have geographical timezone changes, such as <tt>TZ=UTC0</tt>, adjustments using just mktime() do not have this problem.
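
The first approach described in the paragraph above (convert with mktime(), adjust the time since the Epoch, convert back with localtime()) can be sketched as follows. The helper names shift_local_time() and set_tz() are hypothetical, and the sketch assumes a POSIX system where time_t counts seconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: select a timezone for subsequent conversions. */
void set_tz(const char *tz)
{
    assert(tz != NULL);
    setenv("TZ", tz, 1);
    tzset();
}

/* Hypothetical helper: shift a local broken-down time by a fixed number
 * of seconds. Returns 0 on success, -1 on failure. */
int shift_local_time(struct tm *tm, long seconds)
{
    time_t t;

    assert(tm != NULL);
    tm->tm_wday = -1;                   /* sentinel for error detection */
    t = mktime(tm);
    if (t == (time_t)-1 && tm->tm_wday == -1)
        return -1;
    t += seconds;                       /* adjust the time_t, not the tm */
    return localtime_r(&t, tm) != NULL ? 0 : -1;
}
```

Under the assumed TZ rule "EST5EDT,M3.2.0,M11.1.0", shifting 2023-03-12 01:50 forward by 20 minutes this way crosses the spring transition and should yield 03:10 with a positive tm_isdst, rather than a time in the skipped hour.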

The way the mktime() function interprets out-of-range tm structure fields might not produce the expected result when multiple adjustments are made at the same time. For example, if an application tries to go back one day first and then one year by calling localtime(), decrementing tm_mday and tm_year, and then calling mktime() this would not produce the expected result if it was called on 2021-03-01 because mktime() would see the supplied year as 2020 (a leap year) and correct Mar 0 to Feb 29, whereas the intended result was Feb 28. Such issues can be avoided by doing multiple adjustments one at a time when the order in which they are done matters.
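
The 2021-03-01 case above can be sketched as follows, for illustration only (not part of the proposed text). Both function names are hypothetical; the first makes both adjustments before a single mktime() call, the second normalizes after each adjustment.

```c
#include <assert.h>
#include <time.h>

/* Both adjustments at once: the pitfall described above.
 * From 2021-03-01 this yields 2020-02-29. */
int back_day_and_year_together(struct tm *tm)
{
    assert(tm != NULL);
    tm->tm_mday -= 1;
    tm->tm_year -= 1;
    tm->tm_isdst = -1;
    return mktime(tm) == (time_t)-1 ? -1 : 0;
}

/* One adjustment at a time: day first, then year.
 * From 2021-03-01 this yields 2021-02-28 and then 2020-02-28. */
int back_day_then_year(struct tm *tm)
{
    assert(tm != NULL);
    tm->tm_mday -= 1;
    tm->tm_isdst = -1;
    if (mktime(tm) == (time_t)-1)
        return -1;
    tm->tm_year -= 1;
    tm->tm_isdst = -1;
    return mktime(tm) == (time_t)-1 ? -1 : 0;
}
```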

Examples of how mktime() handles some adjustments are:
  • If given Feb 29 in a non-leap year it treats that as the day after Feb 28 and gives back Mar 1.

  • If given Feb 0 it treats that as the day before Feb 1 and gives back Jan 31.

  • If given 21:65 it treats that as 6 minutes after 21:59 and gives back 22:05.

  • If given tm_isdst=0 for a time when DST is in effect, it gives back a positive tm_isdst and alters the other fields appropriately.

  • If there is a DST transition where 02:00 standard time becomes 03:00 DST and mktime() is given 02:30 (with negative tm_isdst), it treats that as either 30 minutes after 02:00 standard time or 30 minutes before 03:00 DST and gives back a zero or positive tm_isdst, respectively, with the tm_hour field altered appropriately.

  • If a geographical timezone changes its UTC offset such that ``old 00:00'' becomes ``new 00:30'' and mktime() is given 00:20, it treats that as either 20 minutes after ``old 00:00'' or 10 minutes before ``new 00:30'', and gives back appropriately altered struct tm fields.
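
The first three adjustments in the list above can be observed with a small helper such as the following, shown for illustration only (not part of the proposed text). The name normalize is hypothetical, and the results assume no timezone transition falls on the dates used.

```c
#include <assert.h>
#include <time.h>

/* Build a local broken-down time from calendar values and let mktime()
 * range-correct it. Returns 0 on success, -1 on failure. */
int normalize(struct tm *out, int year, int mon, int mday, int hour, int min)
{
    assert(out != NULL);
    struct tm tm = {0};
    tm.tm_year = year - 1900;
    tm.tm_mon = mon - 1;     /* months are 0-based in struct tm */
    tm.tm_mday = mday;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_isdst = -1;        /* let mktime() decide whether DST applies */
    if (mktime(&tm) == (time_t)-1)
        return -1;
    *out = tm;
    return 0;
}
```

Calling normalize(&r, 2021, 2, 29, 12, 0) would be expected to leave Mar 1 in r, normalize(&r, 2021, 2, 0, 12, 0) Jan 31, and normalize(&r, 2021, 1, 15, 21, 65) a time of 22:05.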

If an application wants to check whether a given broken-down time is one that is skipped over, it can do so by seeing whether the tm_mday, tm_hour, and tm_min values it gets back from mktime() are the same ones it fed in. Just checking tm_hour and tm_min might appear at first sight to suffice, but tm_mday could also change--without tm_hour and tm_min changing--if, for example, TZ is set to "ABC12XYZ-12" (which might be used in a torture test) or if a geographical timezone changes the offset from Coordinated Universal Time of its standard time by 24 hours.
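
A sketch of that check, for illustration only (not part of the proposed text). The names use_timezone and is_skipped_time are hypothetical, and the sketch assumes the input members are already within their normal ranges, since otherwise ordinary range correction would also change them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: select a timezone for the whole process. */
int use_timezone(const char *tz)
{
    assert(tz != NULL);
    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();
    return 0;
}

/* Returns 1 if the local time in *in is skipped over, 0 if it exists,
 * and -1 if mktime() fails. *in itself is not modified. */
int is_skipped_time(const struct tm *in)
{
    assert(in != NULL);
    struct tm probe = *in;
    probe.tm_isdst = -1;               /* let mktime() determine DST */
    if (mktime(&probe) == (time_t)-1)
        return -1;
    /* Compare tm_mday as well as tm_hour and tm_min, as explained above. */
    return probe.tm_mday != in->tm_mday
        || probe.tm_hour != in->tm_hour
        || probe.tm_min != in->tm_min;
}
```

With TZ set to "EST5EDT,M3.2.0,M11.1.0", the time 2021-03-14 02:30 falls in the spring-forward gap and would be reported as skipped, whereas 12:30 on the same day would not.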

On page 1332 line 44359 section mktime(), change RATIONALE from "None" to:
Implementations are encouraged not to range-correct tm_sec (see step 1 in the DESCRIPTION) in order for the results of making an adjustment to tm_sec always to be equivalent to making the same adjustment to the value returned by mktime(), even when the original and updated times are on different sides of a geographical timezone change. This provides a way for applications to do reliable fixed-period adjustment using only mktime(), as described in APPLICATION USAGE.

The described method for range-correcting the tm structure members uses separate variables to hold adjustment values to be applied later to other members, or (for the year adjustment) used in later calculations, because this is one way of avoiding intermediate member values that are not representable as an int. Implementations may use other methods; all that is required is that tm_year is the only member for which an [EOVERFLOW] error can occur.

The described method for range-correcting tm_mday would, if implemented that way, be highly inefficient for very large values. The efficiency can be improved by observing that any period of 400 years always has the same number of days, so the month-by-month correction method need only be applied for a maximum of 4800 months.
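
The 400-year observation can be sketched as follows, for illustration only (not part of the proposed text). The name reduce_days is hypothetical. A 400-year Gregorian period has 400 * 365 + 97 = 146097 days, so whole periods can be moved into the year adjustment before any month-by-month correction.

```c
#include <assert.h>
#include <stddef.h>

#define DAYS_PER_400_YEARS 146097LL   /* 400 * 365 + 97 leap days */

/* Split a (possibly huge or negative) day offset into a year adjustment
 * that is a multiple of 400 and a remainder in [0, 146097). */
void reduce_days(long long days, long long *years, long long *rem)
{
    assert(years != NULL && rem != NULL);
    long long cycles = days / DAYS_PER_400_YEARS;
    long long r = days % DAYS_PER_400_YEARS;
    if (r < 0) {                      /* C division truncates toward zero */
        r += DAYS_PER_400_YEARS;
        cycles -= 1;
    }
    *years = cycles * 400;
    *rem = r;
}
```

The remainder covers less than 400 years, which is where the bound of 4800 months on the month-by-month method comes from.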

On page 1332 line 44361 section mktime(), change FUTURE DIRECTIONS from "None" to:
A future version of this standard may require that mktime() does not perform the optional range correction of the tm_sec member of the tm structure described at step 1 in the DESCRIPTION.

A future version of this standard is expected to add a timegm() function that is similar to mktime(), except that the tm structure pointed to by timeptr contains a broken-down time in Coordinated Universal Time (rather than the local timezone), where references to localtime() are replaced by references to gmtime(), and where there are no timezone offset or Daylight Saving Time adjustments. A combination of gmtime() and timegm() will be the expected way to perform arithmetic upon a time_t value and remain compatible with the ISO C standard (where the internal structure of a time_t is not specified), since attempting such manipulations using localtime() and mktime() can lead to unexpected results.






Viewing Issue Advanced Details
1629 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Editorial Clarification Requested 2023-01-15 17:30 2023-06-13 11:08
mirabilos
 
normal  
Applied  
Accepted As Marked  
   
mirabilos
mksh
unsure which applies
(page or range of pages)
(Line or range of lines)
---
Note: 0006210
Shell vs. read(2) errors on the script
As indicated in <Pine.BSM.4.64L.2301081426320.6999@herc.mirbsd.org> on the mailing list, both GNU bash <https://savannah.gnu.org/support/index.php?110763> and mksh <https://bugs.launchpad.net/mksh/+bug/2002044> got reports that the shell does not error out on read errors when loading either the first or any subsequent block of the script to execute.

Chet says that treating them as EOF is historical behaviour.

I don’t have a preference either way as I can see benefit in both; in contrast to Chet however I do think that the exit status mandated (if any) does matter and would prefer a high one, or even suggesting that the shell sends itself a suitable signal (PIPE, BUS and HUP, in that order, came to mind) so that the script could even have installed a trap handler to catch this condition beforehand (and clean up).
Decide whether…

① either ⓐ keep to existing behaviour; read errors on the script are treated as EOF, and the shell is still required to exit with the errorlevel of the last command executed (if any; a read error on the first block of a script would equal executing a null command and therefore exit zero),
  or ⓑ that read errors on script input require exiting indicating an error in some way,

and ② if 1b, how shells are supposed to treat these errors; options are at least
  ⓒ some code within 1‥125, as with other errors,
  ⓓ 126 as if the script was not executable (which will require changing 126 as it’s IIRC currently tied to ENOEXEC),
  ⓔ signalling itself with a suitable signal,
  ⓕ exiting with 128+signalnumber of the signal,
  ⓖ any other, possibly high, status.

I dislike 2c (which the bug submitter suggests as he interprets the spec this way currently) for possible confusion with utility exit statūs (grep, diff, cURL, unifdef, etc).
I’m not sure about 2d but it sounds good.
As mentioned above, I’d prefer 2e iff 1b is decided on (I’m similarly good with 1a, I just want a well-argumented decision either way).
If 2e is not palatable, I’d rank 2f almost as high as 2d.
2g has the potential of conflicting with 2f for possibly unrelated signals.
Notes
(0006210)
geoffclare   
2023-03-20 16:21   
Add a row (D2.1 p2330) to the table in 2.8.1 Consequences of Shell Errors:
Unrecoverable read error when reading commands | shall exit *4 | shall exit *4 | yes


and add a new note after the table:
4. If an unrecoverable read error occurs when reading commands, other than from the file operand of the dot special built-in, the shell shall execute no further commands (including any already successfully read but not yet executed) other than any specified in a previously defined EXIT trap action. An unrecoverable read error while reading from the file operand of the dot special built-in shall be treated as a special built-in utility error.


Change P3155, L107009-107011 in the exit status section of the sh utility from:
1-125
A non-interactive shell detected an error other than command_file not found or executable, including but not limited to syntax, redirection, or variable assignment errors.

to:
1-125
A non-interactive shell detected an error other than command_file not found, command_file not executable, or an unrecoverable read error while reading commands (except from the file operand of the dot special built-in); including but not limited to syntax, redirection, or variable assignment errors.

      
Add to the exit status section of the sh utility on P3155 after L107014:
128
An unrecoverable read error was detected while reading commands, except from the file operand of the dot special built-in.
    

    
On D2.1 page 358 line 12462 section <stdlib.h> (RATIONALE), change:
Exit statuses of 126, 127, and greater than 128 are ambiguous in certain circumstances because they have special meanings in the shell (see [xref to XCU 2.8.2]).

to:
Exit statuses of 126 and greater are ambiguous in certain circumstances because they have special meanings in the shell (see [xref to XCU 2.8.2] and the EXIT STATUS section of [xref to XCU sh]).


On D2.1 page 359 line 12469 section <stdlib.h> (RATIONALE), delete:
The value 128 is disallowed for simplicity, making the allowed values 1 to 125 inclusive rather than 1 to 125 inclusive and 128.


After D2.1 page 531 line 18867 section _Exit() (APPLICATION USAGE), add a new paragraph:
Exit statuses of 126 and greater are ambiguous in certain circumstances because they have special meanings in the shell (see [xref to XCU 2.8.2] and the EXIT STATUS section of [xref to XCU sh]).


After D2.1 page 789 line 27009 section exit() (APPLICATION USAGE), add a new paragraph (after applying bug 1490):
See also _Exit().


On D2.1 page 2370 line 76769 section exit (RATIONALE), change:
As explained in other sections, certain exit status values have been reserved for special uses and should be used by applications only for those purposes:

126 A file to be executed was found, but it was not an executable utility.

127 A utility to be executed was not found.

>128 A command was interrupted by a signal.

to:
As explained in other sections, certain exit status values have been reserved for special uses and should be used by applications only for those purposes:

126 A file to be executed was found, but it was not an executable utility.

127 A utility to be executed was not found.

128 An unrecoverable read error was detected by the shell while reading commands, except from the file operand of the dot special built-in.

>128 A command was interrupted by a signal.

On page 3238 line 110033 section tsort (RATIONALE), after applying bug 1617 change:
Implementations are urged to set the implementation-defined maximum number of cycles reported via the exit status to at most 125, leaving 128 for other errors, and leaving 126, 127, and values greater than 128 to have the special meanings that the shell assigns to them. (An implementation that wants to distinguish other types of errors would need to set the maximum to less than 125 so that 128 is not the only code available for those errors).

to:
Implementations are urged to set the implementation-defined maximum number of cycles reported via the exit status to at most 124, leaving values above that maximum through 125 for other errors, and leaving values 126 and greater to have the special meanings that the shell assigns to them.
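
For illustration only (not part of the proposed text), a caller that has obtained the exit status of a shell could classify it according to the reserved values listed above. The enum and function names are hypothetical. As the changed RATIONALE text notes, values of 126 and greater are ambiguous in certain circumstances, so this is a heuristic rather than a definitive decoding.

```c
#include <assert.h>

/* Hypothetical classification of an exit status reported by sh. */
enum shell_status_class {
    SH_SUCCESS,         /* 0 */
    SH_ERROR,           /* 1-125: error reported by the shell or a utility */
    SH_NOT_EXECUTABLE,  /* 126: file found, but not an executable utility */
    SH_NOT_FOUND,       /* 127: utility not found */
    SH_READ_ERROR,      /* 128: unrecoverable read error while reading commands */
    SH_SIGNAL           /* >128: command interrupted by a signal */
};

enum shell_status_class classify_shell_status(int status)
{
    assert(status >= 0 && status <= 255);
    if (status == 0)
        return SH_SUCCESS;
    if (status <= 125)
        return SH_ERROR;
    if (status == 126)
        return SH_NOT_EXECUTABLE;
    if (status == 127)
        return SH_NOT_FOUND;
    if (status == 128)
        return SH_READ_ERROR;
    return SH_SIGNAL;
}
```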




Viewing Issue Advanced Details
1630 [1003.1(2016/18)/Issue7+TC2] Base Definitions Objection Clarification Requested 2023-01-20 21:39 2023-06-13 11:12
mirabilos
 
normal  
Applied  
Accepted As Marked  
   
mirabilos
mksh
3.10
(page or range of pages)
(Line or range of lines)
---
Note: 0006266
Alias names
I have a strong objection to alias names beginning with either ‘+’ or ‘-’ (or that are exactly “[[”, but that’s not yet portable anyway).

This used to not be a problem, but with the recent change to POSIX alias name characters, it has become one.

(This was, indeed, discussed with users who wanted to use +/- as aliases, but there were problems and ambiguities stemming from them (mostly wrt. options), and so mksh denies aliases to start with either of these characters.)
   On page 34, lines 1168 ff., change

     3.10 Alias Name
     In the shell command language, a word consisting solely of alphabetics and
     digits from the portable character set and any of the following characters:

   to

     3.10 Alias Name
     In the shell command language, a word consisting solely of alphabetics and
     digits from the portable character set and any of the following characters
     (where '-' may not be used as the first character of the word):
Notes
(0006266)
geoffclare   
2023-04-20 16:07   
(edited on: 2023-04-20 16:09)
On draft 3 page 2462 line 79899 section 2.3.1, change:
• the TOKEN could be parsed as the command name word of a simple command ...
to:
• either the TOKEN is being considered for alias substitution because it follows an alias substitution whose replacement value ended with a <blank> (see below) or the TOKEN could be parsed as the command name word of a simple command ...

After draft 3 page 2565 line 83832 section alias, add:
5. Add the -F option to interactive uses of ls, even when executed as <tt>xargs ls</tt> or <tt>xargs -0 ls</tt>:
alias ls='ls -F'
alias xargs='xargs '
alias -- -0='-0 '
find . [...] -print | xargs ls      # breaks on filenames with \n (two aliases expanded)
find . [...] -print0 | xargs -0 ls  # minimizes \n issues (three aliases expanded)






Viewing Issue Advanced Details
1632 [Issue 8 drafts] Shell and Utilities Objection Clarification Requested 2023-01-31 00:58 2023-05-16 10:55
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 2.6.1
2319-2320
74775-74794
Note: 0006184
Tilde expansions with HOME="" or HOME value ending in /
The current draft, like earlier versions, doesn't specify what should
happen if HOME="" or if the value of HOME ends in a '/', including, in
particular, the case of HOME=/. (This is not a new problem and could have
been filed against any older version, but as 2.6.1 has already been
updated for other reasons for Issue 8, it seemed sensible to use its
most recent draft.)

The case of HOME=/ is perhaps most interesting, as if the text is treated
literally, which almost all shells do, the result of expanding ~/foo in
such a case is //foo which is an unspecified pathname (as I understand it).

A recent ksh93 produces /foo from this case, but that's the only shell I
can find which does (amusingly, for the case of HOME=/dir/ that version
of ksh93 produces /dir//foo for ~/foo whereas an older one produces //foo
for HOME=/ and /dir/foo for the HOME=/dir/ case)..

mksh also makes /dir/foo in the HOME=/dir/ case and //foo in the HOME=/
case; most other shells simply do as the standard says, and replace the tilde
prefix (just '~' here) with the value of HOME, which results in the // in both
cases.

This leads to the desire to use HOME="" instead of HOME=/ in the case where
the home directory is intended to be the root directory, which results in
~/foo expanding to /foo in almost all shells (the now quite old FreeBSD shell
I use to test with doesn't expand ~ at all when $HOME is an empty string,
but that is, or quite likely was, clearly a bug, so I assume it has been
fixed sometime in the past several years).

However, despite the language of the standard:

   The pathname resulting from tilde expansion shall be treated as if
   quoted to prevent it being altered by field splitting and pathname
   expansion.

most shells expand a word which is just ~ to nothing if HOME='' rather
than (effectively) to "" which that sentence seems to require. Only
bash, the NetBSD shell, and the older version of ksh93 made "".
(Again, the FreeBSD shell just produces ~ here, for the same reason as above).

This all needs to be cleaned up.
I'd suggest requiring that, if a pathname resulting from a tilde expansion
(a term which really ought to be better defined; I assume that in ~/foo
where HOME=/dir the pathname intended is "/dir", not "/dir/foo",
otherwise ~/*.c would never be able to work) ends in a /, any / that
occurs as the next character of the word containing the tilde expansion
be omitted from the result, as that produces better results overall
(despite it requiring changes in most shells, including the one
I maintain).

I'd also suggest being more explicit about whether, when HOME='', ~ expands
to "" or to nothing (it makes no difference if there is more to the word).
For this, I'd prefer "" (for both the obvious reason, and because it is
more consistent with both the current wording and useful behaviour).
Notes
(0006184)
geoffclare   
2023-03-02 16:42   
On page 2320 line 74793 section 2.6.1, change:
The pathname resulting from tilde expansion shall be treated as if quoted to prevent it being altered by field splitting and pathname expansion.
to:
The pathname that replaces the tilde-prefix shall be treated as if quoted to prevent it being altered by field splitting and pathname expansion; if a <slash> follows the tilde-prefix and the pathname ends with a <slash>, the trailing <slash> from the pathname should be omitted from the replacement. If the word being expanded consists of only the <tilde> character and HOME is set to the null string, this produces an empty field (as opposed to zero fields) as the expanded word.
<small>Note: A future version of this standard may require that if a <slash> follows the tilde-prefix and the pathname ends with a <slash>, the trailing <slash> from the pathname is omitted from the replacement.</small>
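
For illustration only (not part of the proposed text), the recommended replacement rule can be sketched as a string operation. The function name tilde_replace and its interface are hypothetical: home stands for the pathname that replaces the tilde-prefix and rest for the remainder of the word after the tilde-prefix. The sketch drops exactly one trailing slash, which the text above recommends ("should") but does not yet require.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the expansion of "~" followed by `rest` into out.
 * Returns 0 on success, -1 if the result does not fit in outsz bytes. */
int tilde_replace(const char *home, const char *rest, char *out, size_t outsz)
{
    assert(home != NULL && rest != NULL && out != NULL);
    size_t hlen = strlen(home);
    /* If a <slash> follows the tilde-prefix and the pathname ends with a
     * <slash>, omit the trailing <slash> from the replacement. */
    if (rest[0] == '/' && hlen > 0 && home[hlen - 1] == '/')
        hlen--;
    int n = snprintf(out, outsz, "%.*s%s", (int)hlen, home, rest);
    return (n >= 0 && (size_t)n < outsz) ? 0 : -1;
}
```

Under this sketch, HOME=/ turns ~/foo into /foo rather than //foo, HOME=/dir/ turns it into /dir/foo, and an empty HOME with a word of just ~ yields an empty string, corresponding to the empty field described above.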




Viewing Issue Advanced Details
1634 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Editorial Enhancement Request 2023-02-18 20:11 2023-05-16 10:59
steffen
 
normal  
Applied  
Accepted As Marked  
   
steffen98141
mailx
2960
98141-2
---
See Note: 0006193.
mailx: more clarification on system mailbox
The current text "suggests" the commands work in any
mailbox, but the code bases shortcut to "edstop()"
unless in a primary (system) mailbox, and only then
$MBOX comes into play at all.

(My own clone has a TODO to change that for the "mbox"
command, as that always makes sense, like $MBOX being
the default for the "save" command, but that is
off-topic. And was issue #991.)
on page 2960, line 98141-2, "touch", change

        Touch the specified messages. If any message in
        msglist is not specifically deleted nor saved
        in a file, it shall be placed in the mbox upon
        normal termination. See exit and quit.

to

        Touch the specified messages. If the current
        mailbox is the system mailbox any message in
        msglist that is not specifically deleted nor
        saved in a file shall be placed in the mbox
        upon normal termination. See exit and quit.
Notes
(0006165)
steffen   
2023-02-21 23:01   
One more iteration on that, which again clarifies which commands
strip which flags.
This now correctly reflects what V10 Mail, BSD Mail of Apple of
2015, OpenBSD Mail of 2023-01-28, and s-nail (devel) do.
It also adds a possible change for the hold variable.

on page 2957, line 98020 ff., "hold" / "preserve", change

        Mark the messages in msglist to be retained in the mailbox
        when mailx terminates. This shall override any commands
        that might previously have marked the messages to be
        deleted. During the current invocation of mailx, only the
        delete, dp, or dt commands shall remove the preserve
        marking of a message.

to

        Allowed only in the system mailbox.
        Mark the messages in msglist to be preserved, as if the
        hold variable were set, upon normal termination, or when
        the folder is changed.
        This shall override any commands that might previously
        have marked the messages to be deleted,
        and only the delete, dp, or dt, as well as the mbox and
        touch commands shall remove the preserve mark of a message.

on page 2960, line 98141-2, "touch", change

        Touch the specified messages.
        If any message in msglist is not specifically deleted nor
        saved in a file, it shall be placed in the mbox upon
        normal termination.
        See exit and quit.

to

        Allowed only in the system mailbox.
        Touch the specified messages.
        Unless overridden by the hold variable, any message in
        msglist that is not specifically deleted nor saved in
        a file shall be placed in the mbox upon normal
        termination, or when the folder is changed.
        Overrides a former hold or preserve request.

Furthermore the solution of issue 991 for the "mbox" command has
to be further refined.
Its editor notes shall instead read

        Allowed only in the system mailbox.
        Arrange for the given messages to end up in the secondary
        mailbox, overriding a possibly set hold variable, upon
        normal termination, or when the folder is changed.
        Overrides a former hold or preserve request.

Ditto. Let's center on the variable.
On page 2952, lines 97820 ff, variable "hold", change

        hold
        Preserve all messages that are read in the system mailbox
        instead of putting them in the mbox save file. The default
        shall be nohold.

to

        hold
        Disable message moving of read messages from the system
        mailbox to the mbox save file upon normal program
        termination or folder change.
        This automatic email management is complemented with the
        commands hold (and preserve), mbox, and touch, which
        partially override the hold variable.
        The default shall be nohold.

On page 2953, lines 97972 ff, "exit", "xit", change

        Exit from mailx without changing the mailbox.
        No messages shall be saved in the mbox (see also quit).

to

        Exit from mailx without performing automatic message
        moving, or any other management tasks.
        Also see quit.

On page 2959, lines 98078 ff, "quit", "EOF", change

        Terminate mailx, storing messages that were read in mbox
        (if the current mailbox is the system mailbox and unless
        hold is set), deleting messages that have been explicitly
        saved (unless keepsave is set), discarding messages that
        have been deleted, and saving all remaining messages in
        the mailbox.

to

        Terminate mailx normally.
        Dependent upon the conditions documented for the variable
        hold this may perform automatic message moving.
        It will delete messages that have been explicitly saved
        (unless keepsave is set), discard messages that have been
        deleted, and save all remaining messages in the mailbox.

(Better yet, move that to the folder command, instead of vice
versa, and only say

        It will quit the folder and perform management task as
        documented there.)

Thank you.
(0006193)
Don Cragun   
2023-03-06 16:27   
Make the changes in Note: 0006165 except for the line 98078 change.

On page 2956 lines 97977 ff, fi[le] [file], fold[er] [file], change:
Quit (see the quit command) from the current file of messages and read in the file named by the pathname file. If no argument is given, the name and status of the current mailbox will be written.
to:
If no argument is given, write the name and status of the current mailbox. Otherwise, close the current file of messages after performing actions as specified for the quit command (except for terminating mailx) and then read in the file named by the pathname file. The behavior is unspecified if file is not a valid mbox.


On page 2959, lines 98078 ff, "quit", "EOF", change:
Terminate mailx, storing messages that were read in mbox (if the current mailbox is the system mailbox and unless hold is set), deleting messages that have been explicitly saved (unless keepsave is set), discarding messages that have been deleted, and saving all remaining messages in the mailbox.
to:
Terminate mailx normally, performing automatic message moving as specified in the description of the variable hold, deleting messages that have been explicitly saved (unless keepsave is set), discarding messages that have been deleted, and saving all remaining messages in the mailbox.




Viewing Issue Advanced Details
1636 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Error 2023-02-23 11:57 2023-05-16 11:01
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
pthread_sigmask()
1734
56226
---
See Desired Action
pthread_sigmask() equivalence to sigprocmask()
The description of pthread_sigmask() says it is equivalent to sigprocmask() except for the single-thread restriction. This omits the exception that the error return convention is different.

In RETURN VALUE there is a clause for sigprocmask() that is both redundant and incorrect: "and the signal mask of the process shall be unchanged". (It is redundant because of line 56250, and incorrect because "process" should be "thread".)

Also, in ERRORS, the statement:
The pthread_sigmask() function shall not return an error code of [EINTR].
is made for pthread_sigmask(), whereas all other requirements that rely on the equivalence in order to apply to both functions are stated for sigprocmask(). It could be changed to sigprocmask(), but given that the first line of the section is:
The pthread_sigmask() and sigprocmask() functions shall fail if:
it would be better to switch to the usual convention of saying "these functions" in the ERRORS section.

Finally, rather than fix these problems by the minimum necessary changes, the description would read better if it is rearranged to describe pthread_sigmask() first and then sigprocmask().
On page 1734 line 56225 section pthread_sigmask(), change:
The pthread_sigmask() function shall examine or change (or both) the calling thread's signal mask, regardless of the number of threads in the process. The function shall be equivalent to sigprocmask(), without the restriction that the call be made in a single-threaded process.

In a single-threaded process, the sigprocmask() function shall examine or change (or both) the signal mask of the calling thread.
to:
The pthread_sigmask() function shall examine or change (or both) the calling thread's signal mask.

On page 1734 line 56243-56250 section pthread_sigmask(), change (3 occurrences):
sigprocmask()
to:
pthread_sigmask()

On page 1734 line 56251 section pthread_sigmask(), change:
The use of the sigprocmask() function is unspecified in a multi-threaded process.
to:
The sigprocmask() function shall be equivalent to pthread_sigmask(), except that its behavior is unspecified if called from a multi-threaded process, and on error it returns -1 and sets errno to the error number instead of returning the error number directly.

On page 1734 line 56255 section pthread_sigmask(), change:
otherwise, -1 shall be returned, errno shall be set to indicate the error, and the signal mask of the process shall be unchanged.
to:
otherwise, -1 shall be returned and errno shall be set to indicate the error.

On page 1735 line 56258 section pthread_sigmask(), change:
The pthread_sigmask() and sigprocmask() functions shall fail if
to:
These functions shall fail if

On page 1735 line 56260 section pthread_sigmask(), change:
The pthread_sigmask() function shall not return an error code of [EINTR].
to:
These functions shall not return an error code of [EINTR].

There are no notes attached to this issue.




Viewing Issue Advanced Details
1637 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2023-02-27 17:17 2023-05-16 11:04
nick
 
normal  
Applied  
Accepted As Marked  
   
Nick Stoughton
Logitech
command
2596
84263-84266
---
Note: 0006203
The command utility does not execute aliases
The description of command with no options given states:

If the command_name is the same as the name of one of the special built-in utilities, the special
properties in the enumerated list at the beginning of Section 2.14 (on page 2384) shall not occur.
In every other respect, if command_name is not the name of a function, the effect of command
(with no options) shall be the same as omitting command.

This suggests that
alias x=ls
command x

should run the alias "x". No existing shell or implementation of the "command" utility I have found does this, but instead says that "x is not found" (in some form).
In D2.1, page 2553, line 83836, change

... if command_name is not the name of a function,

to

... if command_name is not the name of a function or alias,
Notes
(0006203)
geoffclare   
2023-03-13 15:52   
(edited on: 2023-03-13 15:56)
After:
If the command_name is the same as the name of one of the special built-in utilities, the special properties in the enumerated list at the beginning of Section 2.14 (on page 2384) shall not occur. In every other respect, if command_name is not the name of a function, the effect of command (with no options) shall be the same as omitting command .
add:
, except that command_name does not appear in the command word position in the command command, and consequently is not subject to alias substitution (see [xref to 2.3.1]) nor recognized as a reserved word (see [xref to 2.4]).






Viewing Issue Advanced Details
1638 [Issue 8 drafts] Base Definitions and Headers Objection Clarification Requested 2023-03-03 10:03 2023-05-16 11:06
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XBD 8.3
162
5648-5651
See Note: 0006204.
Requirement that TZ "std" and "dst" be 3 chars long (when given) is apparently ambiguous
This could be filed against 7 TC2, and I think probably D3 (when it
appears as well) - I don't think the text has changed.

It wasn't changed by 0001619 though perhaps should have been.

The current (as in D2.1) text says:

   The interpretation of these fields is unspecified if either field is
   less than three bytes (except for the case when dst is missing),

To me that was always clear enough: it means that "dst" doesn't have to be
at least 3 chars if it is omitted (requiring that would be absurd). But on an (unrelated) mailing list, I have just seen a claim:

   When /dst/ is missing, /std/ can be less than 3 bytes.

which is obviously based upon reading that "except" as applying to both
"std" and "dst" rather than just "dst" which I have always assumed.

And when the text is read, without already having a preconceived idea of
what it is intended to mean, I can see how that interpretation is possible.

Beyond that, for all parts of the POSIX TZ string specification, now that
0001619 has been applied, we really need to remove all "is unspecified"
in invalid cases - those should now be simply invalidating the string as
being considered as a POSIX TZ string, leaving it open to be considered as
one of the new form added by 0001619.

That part I am not going to supply new text for here, that can wait until
after D3 is available, when we know what we're working with.
Replace the paragraph in lines 5648-5651 (D2.1) on page 162

The interpretation of these fields is unspecified if either
field is less than three bytes (except for the case when dst is missing),
more than {TZNAME_MAX} bytes, or if they contain characters
other than those specified.


with (something like - and here I am making the wording, for just this one
case for now, invalidate the string, as being a POSIX TZ string):

If std contains less than 3 bytes, or dst
(if present in the string) contains less than 3 bytes, or if either
std or dst contain more than {TZNAME_MAX} bytes, or
if either of those fields contains any characters other than those
specified, the string will not be a valid string of this format, and
shall be considered as a candidate for being of the third format.


(Yes, I know, that's ugly - someone else can do better...)
Notes
(0006204)
Don Cragun   
2023-03-16 15:26   
Change:
The interpretation of these fields is unspecified if either field is less than three bytes (except for the case when dst is missing), more than {TZNAME_MAX} bytes, or if they contain characters other than those specified.
to:
The interpretation of std and, if present, dst is unspecified if the field is less than three bytes or more than {TZNAME_MAX} bytes, or if it contains characters other than those specified.
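
For illustration only (not part of the proposed text), the revised condition can be sketched as a check that is applied to std and, only if it is present, to dst. The function name tz_unquoted_name_ok is hypothetical. The sketch covers the unquoted form only, assumes that form permits just alphabetic characters from the portable character set (this comes from XBD 8.3, not from the text above), assumes an ASCII-compatible encoding for the range tests, and takes the {TZNAME_MAX} limit as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the unquoted designation `name` has a specified
 * interpretation, 0 if its interpretation would be unspecified. */
int tz_unquoted_name_ok(const char *name, size_t tzname_max)
{
    assert(name != NULL);
    size_t len = strlen(name);
    if (len < 3 || len > tzname_max)
        return 0;                      /* too short or too long */
    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        /* Assumption: ASCII-compatible encoding for these range tests. */
        if (!((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')))
            return 0;                  /* character other than those specified */
    }
    return 1;
}
```

A caller would skip the check for dst entirely when the TZ string has no dst field, which is the case the original parenthetical was trying to express.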




Viewing Issue Advanced Details
1639 [Issue 8 drafts] Base Definitions and Headers Objection Clarification Requested 2023-03-05 06:57 2023-05-16 11:07
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XBD 8.3 TZ
162
5643-5644
Note: 0006205
Clarify minimum length requirement of "quoted" std and dst names in POSIX TZ string.
In the description of the POSIX TZ string, in the XBD 8.3 section that
describes the TZ variable, it says:

std and dst Indicate no less than three, nor more than {TZNAME_MAX},
                  bytes that are the designation for the standard (std)

and that's fine, then in the description of the quoted form encoding
of those fields, we have:

Each of these fields may occur in either of two formats quoted or
unquoted:

     -- In the quoted form, the first character shall be the <less-
        than-sign> ('<') character and the last character shall be
        the <greater-than-sign> ('>') character.

That's fine too.

Then the description (after text about the allowed chars, not relevant
here) it concludes:

        The std and dst fields in this case shall not include the
        quoting characters.

And that is OK too, so if we had
        TZ='<+0700>-7'
that would be OK, and the tzname for standard time would be +0700
(with an offset 7 hours ahead of UTC) with the '<' and '>' chars
not included (and no summer time applies).

That's all fairly straightforward.

However, it turns out there is an ambiguity there (or at least it
seems like there is).

The way I read this, is that the std (and dst if given) fields must
be 3 bytes long (at least - forget the max for now), and those 3
bytes can include the '<' and '>' chars, which would make

        TZ='<Z>0'

also acceptable. That is, I read the sentence:

        The std and dst fields in this case shall not include the
        quoting characters.

as meaning that the names extracted from the std and dst fields
don't include the quote characters (and for convenience they're
still called "std" and "dst" as the rest of the text refers to
them that way) - the "minimum 3 bytes" test has already been
satisfied.

Others apparently read that as if it said "when quoted, the minimum
length of the field is 5 bytes, including the quotes, leaving at
least 3 bytes for the extracted name."

It was even claimed that POSIX requires tzname (timezone name
abbreviations -- no-one has ever invented a good term to use to
refer to these things) fields to be at least 3 chars (bytes) long,
because of this rule.

But that's clearly not the case, as in the ':' form (or in the
new "geographic/special" form added by 0001619) there is no
such requirement, it is all implementation defined. Nothing I
can find elsewhere (which doesn't mean it doesn't exist) places
any limits (minimum, maximum, or allowed characters) upon those
values, the limits are there only for the sake of parsing the
POSIX TZ string.

Note that if I am incorrect, and the intent really was to require
at least 3 bytes between the '<' '>' quoting characters, then
first, why? And second that must mean that, assuming
TZNAME_MAX=10 (no idea if it is allowed to be or not, I am picking
10 because strlen("TZNAME_MAX") == 10), then
        TZ='<TZNAME-MAX>12'
would be a legitimate TZ string, where here there are 12 chars
(that is, more than TZNAME_MAX) in the "std" field as it first
appears. [Aside: the change in the example from '_' to '-' was
deliberate, '-' is an allowed character there, '_' probably is
not].

Note the "Why?" there is not rhetorical, an implementation can
choose to accept
        TZ=':<Z>0'
as a legitimate TZ string (because of the ':' the format is
implementation defined) producing a one character tzname (abbreviation),
in this particular case, it isn't even an odd thing to want to do.
That is, applications need to deal with short tzname (tm_zone) values
anyway, restricting the normal quoted format to prevent it seems kind
of petty.

The unquoted form I understand, that has been around since the
dark ages, and 3 char (3 byte) abbreviations were what it initially
allowed - altering that would be hard. The quoted form is much
newer (and far less commonly used).
Change the sentence in XBD 8.3 TZ variable description,
on page 162 of I8 D2.1, lines 5643-4, from:

The std and dst fields in this case
shall not include the quoting characters.

to be:
The tzname abbreviations obtained from the std
and dst fields in this case shall not include the
quoting characters, but shall be referred to as the std
or dst fields, as appropriate, throughout the remainder
of this specification, the requirement that the fields be at
least 3 bytes long applies to the quoted form, the minimum length
after the quoting characters have been removed is 1 byte.


There is no need to say anything about the maximum length of
the quoted fields, as removing the '<' and '>' cannot make
them become longer, only shorter.

If this is not to be the intent of the standard (again, why?)
then after the sentence quoted above (lines 5643-4 on page 162
of I8 D2.1) add a new sentence.

Note: In this case, the quoted std, and
dst (if present) fields shall have a minimum length
of 5 bytes, and a maximum length of {TZNAME_MAX}+2 bytes.


But why?
Notes
(0006205)
geoffclare   
2023-03-16 15:39   
Change:
The std and dst fields in this case shall not include the quoting characters.
to:
The std and dst fields in this case shall not include the quoting characters and the quoting characters do not contribute to the three byte minimum length and {TZNAME_MAX} maximum length.
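
The length rule in the change above can be sketched in C. This is only an illustration, not text from the standard: tz_name_len() and SKETCH_TZNAME_MAX are hypothetical names, the limit of 6 is an assumed value (it matches _POSIX_TZNAME_MAX), and the character tests assume an ASCII-compatible encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for {TZNAME_MAX}; 6 is an assumed value. */
enum { SKETCH_TZNAME_MAX = 6 };

/* Characters allowed in the quoted form: alphanumerics, '+' and '-'. */
static int quoted_ok(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') || c == '+' || c == '-';
}

/* Characters allowed in the unquoted form: alphabetic only. */
static int unquoted_ok(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/*
 * Returns the length of the std (or dst) name at the start of s,
 * NOT counting any '<' and '>' quoting characters, or 0 if the
 * name is shorter than 3 bytes, longer than the limit, or malformed.
 */
size_t tz_name_len(const char *s)
{
    assert(s != NULL);
    size_t n = 0;

    if (*s == '<') {
        s++;                          /* skip the opening quote */
        while (quoted_ok(s[n]))
            n++;
        if (s[n] != '>')              /* missing closing quote */
            return 0;
    } else {
        while (unquoted_ok(s[n]))
            n++;
    }
    return (n >= 3 && n <= SKETCH_TZNAME_MAX) ? n : 0;
}
```

Under this reading, tz_name_len("<+0700>-7") yields 5, while "<Z>0" is rejected because only one byte remains once the quoting characters are excluded.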




Viewing Issue Advanced Details
1640 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Error 2023-03-12 07:00 2023-05-16 11:09
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 3 / true
3318
111745 - 111748
---
Note: 0006206
The rationale given for retaining "true" is nonsense.
The RATIONALE section of the page for the "true" utility says:

    The true utility has been retained in this volume of POSIX.1-2017,
    even though the shell special built-in : provides similar functionality,
    because true is widely used in historical scripts and is less cryptic to
    novice script readers.

That text remains unchanged in Issue 8 draft 2.1

The functionality is only vaguely similar, true is a normal utility, ':' is
a special builtin, hence the consequences of redirection errors are
different, and yes, redirections are used with these utilities.

Further, the OPERANDS listed for "true" are "None" which XCU 1.4 says
means "When this section is listed as ``None.'', it means that the
implementation need not support any operands.", which allows an
implementation to do things with operands if it wants, including issuing
an error message and failing (turning into "false"). While none do, that I
am aware of (true is generally, and entirely, "exit 0" or "exit(0)" in C)
it is possible.

Finally, since this bug is being submitted against Issue 7 TC2,
XCU 2.9.1.1 bullet point 'd' says:

     If the command name matches the name of the type or ulimit utility,
     or of a utility listed in the following table, that utility shall be
     invoked.

Note "shall be invoked" and "true" is in the table. If there were no
"true" utility, that would be impossible, so deleting true really could
not have happened (back then) no matter how redundant it seemed to be.

Note that in Issue 8 draft 2.1, this has altered, it is now 2.9.1.4
(still bullet point d) but that now refers to the intrinsic utilities
defined in XCU 1.7, and "true" is not in that list.
Delete the entire RATIONALE section (lines 111746 - 111748) and replace
them with
None.

Notes
(0006206)
geoffclare   
2023-03-16 15:54   
On page 3317 line 111737 section true (APPLICATION USAGE), change:
The special built-in utility : is sometimes more efficient than true.
to:
Although the special built-in utility : (colon) is similar to true, there are some notable differences, including:

  • Whereas colon is required to accept, and do nothing with, any number of arguments, true is only required to accept, and discard, a first argument of "--". Passing any other argument(s) to true may cause its behavior to differ from that described in this standard.

  • A non-interactive shell exits when a redirection error occurs with colon (unless executed via command), whereas with true it does not.

  • Variable assignments preceding the command name persist after executing colon (unless executed via command), but not after executing true.

  • In shell implementations where true is not provided as a built-in, using colon avoids the overheads associated with executing an external utility.

On page 3318 line 111746 section true, replace the contents of RATIONALE with:
None.

On page 3318 line 111752 section true, add colon and command to SEE ALSO.

On page 2389 line 76461 section colon, change APPLICATION USAGE from:
None.
to:
See the APPLICATION USAGE for true.

On page 2390 line 76479 section colon, add true to SEE ALSO.




Viewing Issue Advanced Details
1641 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2023-03-18 07:52 2023-08-17 10:51
bastien
 
normal  
Applied  
Accepted As Marked  
   
Bastien Roucaries
debian
sys/socket.h
Application usage
sockaddr_storage
Approved
see Note: 0006290
sockaddr_storage is not alias safe
 sockaddr_storage was designed back when strict aliasing wasn’t a problem.

 Back then, one would define a variable of that type, and then access it as any of the other sockaddr_* types, depending on the value of the first member. This is Undefined Behavior.

However, there is no
way to use these APIs without invoking Undefined Behavior, either in
the user program or in libc, so it is still recommended to use this
method. The only correct way to use different types in an API is
through a union.

Example of safe use
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/un.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

union sockaddr_mayalias {
  sa_family_t ss_family;
  struct sockaddr sock;
  struct sockaddr_storage storage;
  struct sockaddr_in in;
  struct sockaddr_in6 in6;
  struct sockaddr_un un;
};
  
int main(void) {
  union sockaddr_mayalias sa = {0};
  socklen_t addrlen = sizeof(sa);
  if(getsockname(STDIN_FILENO, &sa.sock, &addrlen) < 0) {
    perror("getsockname");
    return 1;
  }
  if(addrlen >= sizeof(sa)) {
    errno = EPROTONOSUPPORT;
    perror("getsockname returned an unsupported sockaddr");
    return 1;
  }
  
  switch(sa.ss_family) {
  case(AF_UNSPEC):
    printf("AF_UNSPEC socket\n");
    break;
  case(AF_INET):
    {
      char s[INET_ADDRSTRLEN];
      in_port_t port = ntohs(sa.in.sin_port);
      if (inet_ntop(AF_INET, &(sa.in.sin_addr), s, sizeof(s)) == NULL) {
        perror("inet_ntop");
        return 1;
      }
      printf("AF_INET socket %s:%i\n",s,(int)port);
      break;
    }
  case(AF_INET6):
    {
      char s[INET6_ADDRSTRLEN];
      in_port_t port = ntohs(sa.in6.sin6_port);
      if (inet_ntop(AF_INET6, &(sa.in6.sin6_addr), s, sizeof(s)) == NULL) {
        perror("inet_ntop");
        return 1;
      }
      printf("AF_INET6 socket %s:%i\n",s,(int)port);
      break;
    }
  case(AF_UNIX):
    if(addrlen == sizeof(sa_family_t)) {
      printf("AF_UNIX socket anonymous\n");
      break;
    }
    /* abstract */
    if(sa.un.sun_path[0]=='\0') {
      printf("AF_UNIX abstract socket 0x");
      /* print each byte as two hex digits; the cast avoids sign extension */
      for (size_t i = 0; i < (addrlen - sizeof(sa_family_t)); ++i)
        printf("%02x", (unsigned char)sa.un.sun_path[i]);
      printf("\n");
      break;
    }
    /* named */
    printf("AF_UNIX named socket ");
    for (size_t i = 0; i < strnlen(sa.un.sun_path, addrlen - offsetof(struct sockaddr_un, sun_path)); ++i)
      printf("%c",sa.un.sun_path[i]);
    printf("\n");
    break;
  default:
    errno = EPROTONOSUPPORT;
    perror("socket not supported");
    return 1;
  }

  return 0;
}
1. document aliasing problem
2. define sockaddr storage as:
struct sockaddr_storage {
        union {
                sa_family_t ss_family;
                struct sockaddr sa;
                struct sockaddr_in sin;
                struct sockaddr_in6 sin6;
                struct sockaddr_un sun;
                struct _sockaddr_padding padding;
        };
};
Notes
(0006290)
eblake   
2023-05-25 16:23   
Interpretation response
------------------------
The standard clearly states that when a pointer to a sockaddr_storage structure is cast as a pointer to a sockaddr structure, the ss_family field of the sockaddr_storage structure maps onto the sa_family field of the sockaddr structure and when a pointer to a sockaddr_storage structure is cast as a pointer to a protocol-specific address structure, the ss_family field maps onto a field of that structure that is of type sa_family_t and that identifies the protocol’s address family, and conforming implementations must conform to this.

Rationale:
-------------
In stating these field mapping requirements when a cast operator is applied to the various socket address structures, the standard defines the behavior in circumstances where the behavior is undefined in the ISO C standard. The onus is on implementations to ensure that these mappings are as described in the standard, making use of implementation-specific extensions if necessary, even though this is not stated explicitly.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 386 line 13115 section <sys/socket.h> DESCRIPTION, change:

    
When a pointer to a sockaddr_storage structure is cast as a pointer to a sockaddr structure, the ss_family field of the sockaddr_storage structure shall map onto the sa_family field of the sockaddr structure. When a pointer to a sockaddr_storage structure is cast as a pointer to a protocol-specific address structure, the ss_family field shall map onto a field of that structure that is of type sa_family_t and that identifies the protocol’s address family.


to:

    
When a pointer to a sockaddr_storage structure is converted to a pointer to a sockaddr structure, or vice versa, the ss_family member of the sockaddr_storage structure shall map onto the sa_family member of the sockaddr structure. When a pointer to a sockaddr_storage structure is converted to a pointer to a protocol-specific address structure, or vice versa, the ss_family member shall map onto a member of that structure that is of type sa_family_t that identifies the protocol’s address family. When a pointer to a sockaddr structure is converted to a pointer to a protocol-specific address structure, or vice versa, the sa_family member shall map onto a member of that structure that is of type sa_family_t that identifies the protocol’s address family. Additionally, the structures shall be defined in such a way that the compiler treats an access to the stored value of the sa_family_t member of any of these structures, via an lvalue expression whose type involves any other one of these structures, as permissible, despite the more restrictive expression rules on stored value access as stated in the ISO C standard. Similarly, when a pointer to a sockaddr_storage or sockaddr structure is converted to a pointer to a protocol-specific address structure, the compiler shall treat an access (using this converted pointer) to the stored value of any member of the protocol-specific structure as permissible. The application shall ensure that the protocol-specific address structure corresponds to the family indicated by the member with type sa_family_t of that structure and the pointed-to object has sufficient memory for addressing all members of the protocol-specific structure.


On page 390 line 13260 section <sys/socket.h> APPLICATION USAGE, append a sentence:

    
Note that this example only deals with size and alignment; see RATIONALE for additional issues related to these structures.


On page 390 line 13291 section <sys/socket.h>, change RATIONALE from "None" to:

    
Note that defining the sockaddr_storage and sockaddr structures using only mechanisms defined in early editions of the ISO C standard may produce aliasing diagnostics when applications use casting between pointers to the various socket address structures. Because of the large body of existing code utilizing sockets in a way that could trigger undefined behavior due to strict aliasing rules, this standard mandates that these structures can alias each other for accessing the sa_family_t member of the structures (or other members for protocol-specific structure references), so as to preserve well-defined semantics. An implementation's header files may need to use anonymous unions, or even an implementation-specific extension, to comply with the requirements of this standard.
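
The access pattern that the revised DESCRIPTION requires implementations to support can be sketched as follows. This is an illustrative sketch only: sockaddr_port() is a hypothetical helper, not an interface from the standard, and it relies on the guarantee described above that a sockaddr_storage pointer converted to a protocol-specific pointer may be used to read that structure's members.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns the port (host byte order) stored in *ss,
 * or -1 if the address family carries no port that this sketch knows about.
 */
int sockaddr_port(const struct sockaddr_storage *ss)
{
    assert(ss != NULL);

    /* ss_family maps onto the sa_family_t member of the specific structure */
    switch (ss->ss_family) {
    case AF_INET:
        /* pointer conversion to the protocol-specific structure */
        return ntohs(((const struct sockaddr_in *)ss)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
    default:
        return -1;
    }
}
```

A caller would typically fill the sockaddr_storage object with getsockname() or accept() and then pass its address to the helper; the application remains responsible for checking the family before touching protocol-specific members.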




Viewing Issue Advanced Details
1642 [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers Editorial Clarification Requested 2023-03-19 09:10 2023-06-13 10:43
bastien
 
normal  
Applied  
Accepted As Marked  
   
Bastien Roucaries
debian
any
any
any
---
Note: 0006213
DUMB terminal is not defined
Hi,

There is no definition of the dumb terminal referred to in the vi and ed pages.

- Define what a dumb terminal is
- Define how to detect a dumb terminal
Notes
(0006213)
geoffclare   
2023-03-21 12:10   
(edited on: 2023-03-21 13:19)
There is only one use of "dumb" in normative text, which is in the description of the redraw edit option (P2739 L89670).

In the description of the -s option, the phrase "the terminal is a type incapable of supporting open or visual modes" is used instead and the rationale explains "The terminal type ``incapable of supporting open and visual modes'' has historically been named ``dumb''". Given that this statement exists, I don't see any problem with the use of "dumb terminal" elsewhere in non-normative text.

Proposed change:

On page 2739 line 89670 section ex, change:
The editor simulates an intelligent terminal on a dumb terminal.
to:
If redraw is set and the terminal is a type incapable of supporting open or visual modes, the editor shall redraw the screen when necessary in order to update its contents.






Viewing Issue Advanced Details
1643 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Error 2023-03-21 11:13 2023-06-13 10:45
bhaible
 
normal  
Applied  
Accepted  
   
Bruno Haible
GNU
fprintf
913
30960
---
fprintf %lc: wrong reference to the current conversion specification
The reference to "the wint_t argument to the ls conversion specification" makes no sense, since an ls conversion specification takes a wchar_t* argument.

In ISO C 99 this paragraph reads
"If an l length modifier is present, the wint_t argument is converted as if by
an ls conversion specification with no precision and an argument that points
to the initial element of a two-element array of wchar_t, the first element
containing the wint_t argument to the lc conversion specification and the
second a null wide character."

So, apparently POSIX and ISO C 99 are based on a common ancestor document, and the reference to "ls" has been corrected in ISO C 99, i.e. changed to "lc".
Change "the wint_t argument to the ls conversion specification" to "the wint_t argument to the lc conversion specification".
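
The equivalence stated in the ISO C wording can be illustrated with a small C sketch. The helper names format_lc() and format_as_ls() are hypothetical; the sketch assumes a locale in which the wide character is representable (for example an ASCII letter in the "C" locale).

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Formats wc with the lc conversion specification. */
int format_lc(char *out, size_t size, wint_t wc)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%lc", wc);
}

/*
 * Formats wc the way the standard describes lc: as an ls conversion with
 * no precision, applied to a two-element wchar_t array whose first element
 * is the wint_t argument and whose second is a null wide character.
 */
int format_as_ls(char *out, size_t size, wint_t wc)
{
    assert(out != NULL && size > 0);
    wchar_t pair[2] = { (wchar_t)wc, L'\0' };
    return snprintf(out, size, "%ls", pair);
}
```

For a representable character, both helpers are expected to write the same bytes and return the same count, which is why the reference inside the lc description has to name lc rather than ls.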
Notes




Viewing Issue Advanced Details
1644 [1003.1(2016/18)/Issue7+TC2] System Interfaces Comment Enhancement Request 2023-03-22 09:52 2023-06-13 10:48
bastien
 
normal  
Applied  
Accepted As Marked  
   
Bastien Roucaries
debian
dlsym - get the address of a symbol from a symbol table handle
Application usage
all
---
Note: 0006272
void * to function pointer is described in annex J of C standard (informative).
The standard says:
  Note that conversion from a void * pointer to a function pointer as in:

   fptr = (int (*)(int))dlsym(handle, "my_function");

  is not defined by the ISO C standard. This standard requires this conversion to work correctly on conforming implementations.
J.5.7 Function pointer casts
1 A pointer to an object or to void may be cast to a pointer to a function, allowing data to
be invoked as a function (6.5.4).
2 A pointer to a function may be cast to a pointer to an object or to void, allowing a
function to be inspected or modified (for example, by a debugger) (6.5.4).
This is not true: the behavior is described in Annex J (Portability issues), §J.5.7 Function pointer casts (informative).

Add a note that a POSIX-conforming compiler should implement §J.5.7 Function pointer casts:
1 A pointer to an object or to void may be cast to a pointer to a function, allowing data to
be invoked as a function (6.5.4).
2 A pointer to a function may be cast to a pointer to an object or to void, allowing a
function to be inspected or modified (for example, by a debugger) (6.5.4).
Notes
(0006272)
geoffclare   
2023-04-27 15:31   
On page 746 line 25418 section dlsym(), change:
cast to a pointer to the type of the named symbol
to:
converted from type pointer to void to a pointer to the type of the named symbol
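
The conversion in question looks like this in practice. This is a minimal sketch under stated assumptions: lookup_int_fn() and the int_fn typedef are hypothetical names, and the lookup assumes a dynamically linked program whose global symbol table exports the requested function (some systems also need the program linked with -ldl).

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical typedef for the function type being looked up. */
typedef int (*int_fn)(int);

/*
 * Looks up name in the global symbol table and returns it as a pointer
 * to a function taking and returning int, or a null pointer on failure.
 * The handle is deliberately left open so the result stays usable.
 */
int_fn lookup_int_fn(const char *name)
{
    assert(name != NULL);

    /* A null file argument asks dlopen() for the global symbol table. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    void *sym = dlsym(handle, name);

    /*
     * Conversion from pointer to void to a function pointer: not defined
     * by ISO C (only listed as a common extension in J.5.7), but required
     * to work on POSIX-conforming implementations.
     */
    return (int_fn)sym;
}
```

A caller would check the result against a null pointer before calling through it, for example after lookup_int_fn("abs").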




Viewing Issue Advanced Details
1645 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Clarification Requested 2023-03-22 19:47 2023-08-17 10:53
eblake
 
normal  
Applied  
Accepted As Marked  
   
Eric Blake
Red Hat
ebb.execvp
XSH exec
784
26548
Approved
Note: 0006281
execvp( ) requirements on arg0 are too strict
The standard is clear that execlp() and execvp() cannot fail with ENOEXEC (except in the extremely unlikely event that attempting to overlay the process with sh also fails with that error), but must instead attempt to re-execute sh with a command line set so that sh will execute the desired filename as a shell script. Furthermore, the standard is explicit that the original:
execlp(file, arg0, arg1, ..., NULL)

is retried as:
execl(shell path, arg0, file, arg1, ..., NULL)


that is, whatever name was passed in argv[0] in the original attempt should continue to be the argv[0] seen by the sh process that will be parsing file.

But in practice, this does not actually happen on a number of systems. Here is an email describing bugs found in three separate projects (busybox, musl libc, and glibc) while investigating why attempting to rely on what the standard says about execvp() fallback behavior fails on Alpine Linux:
https://listman.redhat.com/archives/libguestfs/2023-March/031135.html

In particular:
1. busybox installs /bin/sh as a multi-name binary, whose behavior DEPENDS on argv[0] ending in a basename of sh. If execvp() actually calls execl("/bin/sh", arg0, file, ...), the binary installed at /bin/sh will NOT see 'sh' as its basename but instead whatever is in arg0, and fails to behave as sh. (Bug filed at https://bugs.busybox.net/show_bug.cgi?id=15481 asking the busybox team to consider installing a minimal shim for /bin/sh that is NOT dependent on argv[0])
2. musl currently refuses to do ENOEXEC handling (a knowing violation of POSIX, but the alternative requires coordinating the allocation of memory to provide the space for the larger argv entailed by injecting /bin/sh into the argument list); see https://www.openwall.com/lists/musl/2020/02/12/9 which acknowledges the issue, where Adélie Linux has patched musl for POSIX compliance but upstream musl does not like the patch. A followup mail (https://www.openwall.com/lists/musl/2020/02/13/3) surveyed the behavior of various other libc implementations; many use a VLA to handle things, but musl argues that a VLA is itself prone to bugs. Arguably, musl's claim that execvp() must be safe to use after vfork(), and therefore cannot use malloc(), is a bit of a stretch (the standard explicitly documents that execlp() and execvp() need not be async-signal-safe; and even though we've deprecated vfork(), the arguments about what is safe after vfork() roughly correspond to the same arguments about what async-signal-safe functions can be used between regular fork() and exec*()).
3. glibc does ENOEXEC handling, but passes "/bin/sh" rather than arg0 as the process name of the subsequent shell invocation, losing any ability to expose the original arg0 to the script. https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/execvpe.c;h=871bb4c4#l51 shows that the fallback executed is equivalent to execl("/bin/sh", "/bin/sh", file, arg1, ...)

Admittedly, Linux in general, and particularly Alpine Linux, will intentionally diverge from POSIX any time they feel it practical; but we should still consider whether the standard is too strict in requiring argv[0] to pass through unchanged to the script when the fallback kicks in. And I think the real intent is less about what sh's argv[0] is, and more about what the script's $0 is.

Even historically, FreeBSD used to pass in "sh" rather than preserving arg0, up until 2020 (https://cgit.freebsd.org/src/commit/?id=301cb491ea). And _requiring_ arg0 to be used unchanged falls apart when a user invokes execlp("binary", NULL, NULL) (such behavior is non-conforming, since line 26559 states "The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.", but a fallback to execl("/bin/sh", NULL, "binary", NULL) obviously won't do what is intended, so the library has to stick something there).

Why don't we see complaints about this more frequently? Well, for starters, MOST people install shell scripts (or even scripts designed for other interpreters) with a #! shebang line. The standard is explicit that this is outside the realm of the standards (because different systems behave differently on how that first line is parsed to determine which interpreter to invoke), but at least on Linux, a script with a #! line NEVER fails with ENOEXEC - that aspect is handled by the kernel. The only time you ever get to a glibc or musl fallback that even has to worry about ENOEXEC is when the script has no leading #! line, which tends to not be common practice (even though the standard seems to imply otherwise). Additionally, most shells don't directly call execvp() - they instead do their _own_ PATH lookup, and then use execl() or similar - if that fails with ENOEXEC, the shell itself can then immediately parse the file contents with the desired $0 already in place, without having to rely on execvp() to try to spawn yet another instance of sh for the purpose.

In playing with this, I note that the as-if rule might permit:

execl("/bin/sh", "sh", "-c", ". quoted_filename", arg0, arg1, ..., NULL)

where quoted_filename is created by quoting the original file in such a way that the shell sees the original name after processing quoting rules (so as not to open a security hole when file contains shell metacharacters) as roughly the same effect as execl("/bin/sh", arg0, file, arg1, ..., NULL) - in that it kicks off a shell invocation that executes commands from the given file while $0 is set to the original name. It additionally has the benefits that it will work on a system with busybox as /bin/sh (because busybox still sees "sh" as argv[0], but also has enough knowledge of what to store into $0 for the duration of sourcing the file). So I went ahead and included a mention of that in non-normative RATIONALE - but we may decide to drop that. Why? Because we took pains in 0000953 to clarify that the dot utility might parse a file as either a program or a compound_list, while the 'sh file arg1' form requires parsing as a program, so it might create an observable difference if this alternative fallback ends up parsing as a compound_list (or we might also decide to tweak the proposed normative text to allow for this difference in parsing). What's more, if musl is already complaining about injecting "/bin/sh" into argv as being hard to do safely given memory constraints after vfork( ), it will be even harder to argue in favor of creating the string ". quoted_filename", which requires even more memory.
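
The quoting step mentioned above can be sketched in C as follows. shell_quote() is a hypothetical helper, and single-quote wrapping is just one of several possible quoting strategies. Note that the sketch uses malloc(), which is precisely the kind of allocation musl objects to on this code path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a malloc()ed copy of s wrapped in single
 * quotes, with every embedded single quote rewritten as '\'' so that the
 * shell sees the original bytes after quote removal. Returns a null
 * pointer if allocation fails. The caller frees the result.
 */
char *shell_quote(const char *s)
{
    assert(s != NULL);

    size_t len = 2;                       /* opening and closing quote */
    for (const char *p = s; *p != '\0'; p++)
        len += (*p == '\'') ? 4 : 1;      /* ' becomes the 4 bytes '\'' */

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    *q++ = '\'';
    for (const char *p = s; *p != '\0'; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);        /* close, escaped quote, reopen */
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

The string ". quoted_filename" would then be built by prefixing the result with a dot and a space, which costs yet another allocation.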

In parallel with this, I'm planning to open a bug report against glibc to see if they will consider making the same change as FreeBSD did in 2020 of preserving arg0 to the eventual script. But they may reply that it risks breaking existing clients that have come to depend on the fallback passing $0 as a variant of "sh" rather than the original arg0, therefore my proposal here is to relax the requirements of the standard to allow more existing implementations to be rendered compliant as-is, even though it gives up the nice $0 guarantees.

I also wonder if the standard should consider adding support for 'exec -a arg0 cmd arg1...', which is another common implementation extension in many sh versions for setting argv[0] of the subsequent cmd. That belongs in a separate bug report, if at all. But by the as-if rule, an implementation with that extension might use execl("/bin/sh", "sh", "-c", "exec -a \"$0\" quoted_file \"$@\"", arg0, arg1, ..., NULL) as a way to execute the correct file with the desired $0 even if it can't use the proposed dot trick due to difference in parse scope.
Line numbers are from Issue 7 + TC2 (POSIX 2017), although the same text appears in draft 3 of Issue 8.

At page 784 lines 26552-26557 (XSH exec DESCRIPTION), change:
...the executed command shall be as if the process invoked the sh utility using execl( ) as follows:
execl(<shell path>, arg0, file, arg1, ..., (char *)0);
where <shell path> is an unspecified pathname for the sh utility, file is the process image file, and for execvp( ), where arg0, arg1, and so on correspond to the values passed to execvp( ) in argv[0], argv[1], and so on.
to:
...the executed command shall be as if the process invoked the sh utility using execl( ) as follows:
execl(<shell path>, <name>, file, arg1, ..., (char *)0);
where <shell path> is an unspecified pathname for the sh utility, <name> is an unspecified process name, file is the process image file, and for execvp( ), where arg1, arg2, and so on correspond to the values passed to execvp( ) in argv[1], argv[2], and so on.


After page 794 line 26981 (XSH exec RATIONALE), add a new paragraph:
When execlp( ) or execvp( ) fall back to invoking sh because of an ENOEXEC condition, the standard leaves the process name (what becomes argv[0] in the resulting sh process) unspecified. Existing implementations vary on whether they pass a variation of "sh", or preserve the original arg0. There are existing implementations of sh that behave differently depending on the contents of argv[0], such that blindly passing the original arg0 on to the fallback execution can fail to invoke a compliant shell environment. An implementation may instead utilize execl(<shell name>, "sh", "-c", ". <quoted_file>", arg0, arg1, ..., NULL), where quoted_file is created by escaping any characters special to the shell, as a way to expose the original $0 to the shell commands contained within file without breaking sh sensitive to the contents of argv[0].
Notes
(0006281)
geoffclare   
2023-05-11 16:05   
Interpretation response
------------------------
The standard states the value of arg0 to be passed to the sh utility, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The standard does not match some existing practice, and a different arg0 value is not observable by applications (without using extensions).

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
At page 784 lines 26552-26557 (XSH exec DESCRIPTION), change:
...the executed command shall be as if the process invoked the sh utility using execl( ) as follows:
execl(<shell path>, arg0, file, arg1, ..., (char *)0);
where <shell path> is an unspecified pathname for the sh utility, file is the process image file, and for execvp( ), where arg0, arg1, and so on correspond to the values passed to execvp( ) in argv[0], argv[1], and so on.
to:
...the executed command shall be as if the process invoked the sh utility using execl( ) as follows:
execl(<shell path>, <name>, file, <args>, (char *)0);
where <shell path> is an unspecified pathname for the sh utility, <name> is an unspecified string, file is the process image file, and where <args> is zero or more parameters corresponding to any initial non-null arguments passed after arg0 for execlp( ) or to any initial non-null members of argv starting at argv[1] for execvp( ).

After page 794 line 26981 (XSH exec RATIONALE), add a new paragraph:
When execlp( ) or execvp( ) fall back to invoking sh because of an ENOEXEC condition, the standard leaves the process name (what becomes argv[0] in the resulting sh process) unspecified. Existing implementations vary on whether they pass a variation of "sh", or preserve the original arg0. There are existing implementations of sh that behave differently depending on the contents of argv[0], such that blindly passing the original arg0 on to the fallback execution can fail to invoke a compliant shell environment. Because of the requirements on how sh handles its command line arguments, the shell script will see $0 containing the pathname of the script being executed, regardless of the value of argv[0].
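
As an illustrative sketch only (not part of the interpretation), the last sentence above can be observed from the shell: when sh is given a script pathname as its first operand, the script sees that pathname in $0. The helper name dollar_zero_matches is hypothetical, and the mktemp utility is assumed to be available.

```sh
# Hypothetical helper: succeeds if a script run as "sh FILE args..."
# sees FILE in $0, which is what the ENOEXEC fallback relies on.
dollar_zero_matches() (
    script=$(mktemp) || exit 1              # assumption: mktemp is available
    printf '%s\n' 'printf "%s\n" "$0"' > "$script"
    seen=$(sh "$script" arg1 arg2)          # same shape as execl(<shell path>, <name>, file, <args>)
    rm -f "$script"
    [ "$seen" = "$script" ]
)

if dollar_zero_matches; then
    echo '$0 is the script pathname'
fi
```

The sketch does not control argv[0] of the sh process; it only shows that $0 is taken from the script operand.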




Viewing Issue Advanced Details
1646 [Issue 8 drafts] System Interfaces Objection Omission 2023-03-22 20:44 2023-06-13 11:15
eblake
 
normal  
Applied  
Accepted As Marked  
   
Eric Blake
Red Hat
ebb.exec at_quick_exit
XSH exec
866
29540
Note: 0006299
exec*() misses reference to at_quick_exit()
Now that C17 pulled in at_quick_exit(), we need to add that to the list of handlers that are dropped upon successful execl() and friends.
At page 866 line 29540 (XSH exec DESCRIPTION), change:
After a successful call to any of the exec functions, any functions previously registered by the atexit( ) or pthread_atfork( ) functions are no longer registered.
to:
After a successful call to any of the exec functions, any functions previously registered by the atexit( ), at_quick_exit( ), or pthread_atfork( ) functions are no longer registered.
Notes
(0006299)
geoffclare   
2023-06-01 15:28   
At page 625 line 22114 (XSH at_quick_exit() DESCRIPTION), delete:
After a successful call to any of the exec functions, any functions previously registered by at_quick_exit() shall no longer be registered.

    
At page 636 line 22406 (XSH atexit() DESCRIPTION), delete:
After a successful call to any of the exec functions, any functions previously registered by atexit() shall no longer be registered.


At page 866 line 29540 (XSH exec DESCRIPTION), change:
After a successful call to any of the exec functions, any functions previously registered by the atexit( ) or pthread_atfork( ) functions are no longer registered.
to:
After a successful call to any of the exec functions, any functions previously registered by the atexit( ), at_quick_exit( ), or pthread_atfork( ) functions are no longer registered.




Viewing Issue Advanced Details
1647 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Clarification Requested 2023-03-28 16:32 2023-08-22 14:22
eblake
 
normal  
Applied  
Accepted As Marked  
   
Eric Blake
Red Hat
ebb.printf %lc
fprintf
913
30957
Approved
Note: 0006239
printf("%lc", (wint_t)0) can't output NUL byte
In comparing a table of wide vs. narrow print operations, coupled with the NUL byte/character, we have the following surprising table of results:
narrow with narrow: printf("%c", '\0') -> 1 NUL byte
wide with wide: wprintf("%lc", L'\0') -> 1 NUL character
wide with narrow: wprintf("%c", '\0') -> 1 NUL character
narrow with wide: printf("%lc", L'\0') -> 0 bytes

Why? Because "If an l (ell) qualifier is present, the wint_t argument shall be converted as if by an ls conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the ls conversion specification and the second element contains a null wide character.", and printf("%ls", L"") outputs 0 bytes.

Even though ISO C has specified this for more than 23 years, it would make a lot more sense if 0 weren't special-cased as the one wide character you can't print to a narrow stream. Most libc have done the common-sense mapping, and only recently did we learn that musl differed from everyone else in actually obeying the literal requirements of C, leading to this glibc bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=30257

Since these interfaces defer to the C standard unless explicitly stated otherwise, any change we do here will need to be coordinated with WG14. I recommend that the Austin Group start by filing a ballot defect report against the upcoming C23 recommending that narrow *printf %lc should behave like the other three combinations. At that point, even though Issue 8 will be tied to C17 which has the undesirable semantics, we can use <CX> shading to require POSIX to be in line with what C23 will land on. However, we should not start an interpretation request unless we know for sure how WG14 wants to proceed.
After coordination with WG14, and after applying the change to 0001643, change page 913 line 30957 (fprintf DESCRIPTION for <tt>%c</tt>) from:
If an <tt>l</tt> (ell) qualifier is present, the wint_t argument shall be converted as if by an <tt>ls</tt> conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the <tt>lc</tt> conversion specification and the second element contains a null wide character.
to:
If an <tt>l</tt> (ell) qualifier is present, <CX>the wint_t argument shall be converted to a multi-byte sequence as if by a call to wcrtomb( ) with the wint_t argument converted to wchar_t and an initial shift state, and the resulting bytes written.</CX>
Notes
(0006239)
eblake   
2023-03-30 16:33   
(edited on: 2023-04-03 15:28)
In addition to this interpretation response, the Austin Group plans to file a ballot defect on C23 to WG14.

Interpretation response
------------------------
The standard states that printf("%lc", (wint_t)0) writes no bytes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The requirement to write no bytes does not match historical practice. However, the requirement derives from the ISO C standard and an attempt to change the requirements in a TC for Issue 7 would introduce a conflict. Therefore this will be addressed in Issue 8 by not deferring to the ISO C standard regarding this behavior.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Change page 909 line 30747 (fprintf DESCRIPTION) from:

    
Excluding dprintf(): The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2017 defers to the ISO C standard.


to:

    
Except for dprintf() and the behavior of the <tt>%lc</tt> conversion when passed a null wide character, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard for all fprintf(), printf(), snprintf(), and sprintf() functionality except in relation to the <tt>%lc</tt> conversion when passed a null wide character.


    
Change page 913 line 30957 (fprintf DESCRIPTION for <tt>%c</tt>) from:
    
If an <tt>l</tt> (ell) qualifier is present, the wint_t argument shall be converted as if by an <tt>ls</tt> conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the <tt>ls</tt> conversion specification and the second element contains a null wide character.

to:
    
If an <tt>l</tt> (ell) qualifier is present, [CX]the wint_t argument shall be converted to a multi-byte sequence as if by a call to wcrtomb( ) with a pointer to storage of at least MB_CUR_MAX bytes, the wint_t argument converted to wchar_t, and an initial shift state, and the resulting byte(s) written.[/CX]


Add a paragraph to RATIONALE, page 920 line 31263:
    
The behavior specified for the <tt>%lc</tt> conversion differs slightly from the specification in the ISO C standard, in that printing the null wide character produces a null byte instead of 0 bytes of output as would be required by a strict reading of the ISO C standard's direction to behave as if applying the <tt>%ls</tt> specifier to a wchar_t array whose first element is the null wide character. Requiring a multibyte output for every possible wide character, including the null character, matches historical practice, and provides consistency with <tt>%c</tt> in fprintf( ) and with both <tt>%c</tt> and <tt>%lc</tt> in fwprintf( ). It is anticipated that a future edition of the ISO C standard will change to match the behavior specified here.






Viewing Issue Advanced Details
1648 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2023-03-30 09:34 2023-06-13 10:56
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
1.4 Utility Description Defaults
2339
74433-74444
---
Confusing description of ASYNCHRONOUS EVENTS default behaviour
The default behaviour described for signal handling in the ASYNCHRONOUS EVENTS section of 1.4 Utility Description Defaults is highly confusing because the word "default" is doing multiple jobs.

It also has a list of three items that at first sight seem inter-related, but the list is introduced with "... shall be one of the following", indicating that they are independent choices. By a strict reading, this gives implementations the freedom to do unexpected things (like be terminated by a signal that was inherited as ignored).

In addition, the second list item appears to be redundant as it includes the condition "when no action has been taken to change the default". Since this section is describing the default behaviour, the condition is always true and thus the second item is effectively just a repeat of the first.

Finally, the last paragraph of this section should be conditional on cases where the signal terminates the utility.
Change:
... the action taken as a result of the signal shall be one of the following:
  1. The action shall be that inherited from the parent according to the rules of inheritance of signal actions defined in the System Interfaces volume of POSIX.1-2017.

  2. When no action has been taken to change the default, the default action shall be that specified by the System Interfaces volume of POSIX.1-2017.

  3. The result of the utility's execution is as if default actions had been taken.
A utility is permitted to catch a signal, perform some additional processing (such as deleting temporary files), restore the default signal action (or action inherited from the parent process), and resignal itself.
to:
... the action taken as a result of the signal shall be as follows:
  • If the action inherited from the invoking process, according to the rules of inheritance of signal actions defined in the System Interfaces volume of POSIX.1-2017, is for the signal to be ignored, the utility shall ignore the signal.

  • If the action inherited from the invoking process, according to the rules of inheritance of signal actions defined in the System Interfaces volume of POSIX.1-2017, is the default signal action, the result of the utility's execution shall be as if the default signal action had been taken.
When the required action is for the signal to terminate the utility, the utility may catch the signal, perform some additional processing (such as deleting temporary files), restore the default signal action, and resignal itself.
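
A minimal sketch of the first bullet, assuming sh stands in for the utility: SIGTERM is set to ignored in the invoking process, the child inherits that action, and a SIGTERM sent to the child does not terminate it. The function name inherited_ignore_demo is hypothetical.

```sh
# Hypothetical demo: TERM is ignored in the invoking process, so the
# utility (here a child sh) inherits "ignored" and is not terminated.
inherited_ignore_demo() (
    trap '' TERM                              # ignore SIGTERM in the invoking process
    sh -c 'kill -TERM "$$"; echo survived'    # child signals itself, then continues
)

inherited_ignore_demo
```

The function body runs in a subshell, so the ignored action does not leak into the caller.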

Notes




Viewing Issue Advanced Details
1649 [Issue 8 drafts] Shell and Utilities Objection Error 2023-03-31 01:55 2023-10-10 09:31
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 2.6.5
2476
80478 - 80504
See Note: 0006488.
Field splitting is woefully under specified, and in places, simply wrong
I didn't really believe this when it was pointed out on a mailing list,
but nowhere in XCU 2.6.5 (Field Splitting) does it say what happens when
the expansion being split is not empty but contains no IFS characters.

Further, or perhaps as a more general case of the above, it also doesn't
say what happens to any characters that follow the last IFS character in
the expansion being split - nothing gets delimited in that case, as there
is no delimiter to accomplish that.

Note that in XCU 2.6.5 bullet point 1, the example contains a trailing
<space> which the text says "shall be ignored" - but is still present in
the example, and so could be read as delimiting the "bar" field (before
being ignored). The text there is ambiguous, as it doesn't specify the
ordering of the "shall be ignored" wrt the "shall delimit a field". We
could delimit the field first, and then ignore the trailing IFS white space.

It also doesn't explicitly say what happens to the results - it is clear
that multiple fields can result, and that zero fields can result, but it
doesn't say how this applies to a case like prefix${VAR}suffix in the
various cases.

It turns out there are a number of other errors, or differences from
the way that shells actually behave, in the text as well.

Lastly, as best I can tell, the operation "delimited" isn't actually
defined anywhere. It is used in this section (also in XCU 2.3 (Token
Recognition) and perhaps other places) but I cannot locate a definition.

One might take the second paragraph of XCU 2.6.5 (lines 80481-80485) as
a kind of definition, at least for the purposes of this section, but it
isn't really explicit that it intends to be that.

I don't think there is any disagreement what should happen in any of these
cases (except perhaps one), which is perhaps why the text which says what
that is is missing. We all simply know what happens, and assume that.

There's another issue - shells do not agree as to what constitutes IFS white
space. XBD 3.412 defines "white space" (in the POSIX locale) to include
carriage return, vertical tab, and form feed, and the standard (XBD 2.9.5
bullet point 3) says "any of the white space characters in the IFS value".
bash, yash and ksh93 allow those "extra" three white space characters to be
IFS white space - other shells do not, only space/tab/newline are considered
as candidates for being IFS white space (including mksh and bosh).
I would presume (but have no easy way to test) that if a locale defined
some other characters as being space characters (as in "isspace() returns
true"), then bash, yash, and ksh93 would treat those chars as white space
as well. Something needs to be done to handle that difference (not that
I have ever seen any real code using any of \r \v or \f as an IFS character).

Lastly, while I am here, one other (not directly related) issue that should
be cleared up ...

   The shell shall treat a byte sequence forming any of the characters
   in the IFS value

doesn't say what character encoding is intended to be used for this purpose.
Clearly it is set by LC_CTYPE (I hope) - but which version of that env var?
The value that LC_CTYPE had when IFS was assigned a value (which would mean
LC_CTYPE=C (or POSIX, or unset), I assume, for the default $' \t\n' for IFS)?
Or are we supposed to use the value of LC_CTYPE at the time IFS is being used,
which would mean that changing LC_CTYPE might have the side effect of altering
the meanings of the field separators (terminators) to be used by field
splitting.

  [Aside: in the previous paragraph, I use "LC_CTYPE" as a shorthand to
   refer to the locale's character encoding settings, however that is
   communicated to the shell - via LANG LC_CTYPE LC_ALL or some other
   way - this issue isn't about locale definitions/uses so none of those
   differences are relevant here.]

To help make sure that the rules that we end up specifying match what
is actually implemented, I wrote a little test script. It and its (normal)
output are reproduced below. All (Bourne compatible) shells, with two
exceptions, produce identical output. One of the exceptions is the ancient
implementation of pdksh which masquerades as /bin/ksh on NetBSD - that thing
is full of bugs, and the (wrong) output here is just one of many examples.
That one can simply be ignored. The other is mksh which produces different
results for the SCSCS and S5 tests (its output will be shown below). Note
here that "all shells" means those I have to test (which excludes ksh88, and
the original Bourne Shell and its very close relatives - but includes bosh)
but does include zsh (as long as it is run in "--emulate sh" mode - in its
default mode it is "different").

========================================== the test script (also attached)

argc() {
   printf '%s:\t' "$1"; shift
   printf "%2d args:" "$#"
   printf " <%s>" "$@"
   printf '\n'
}
   
# you will probably need to edit the following line, it will not survive mantis
IFS=' ,' # IFS=$' \t,' except not all shells have $'' yet

SPACE=' '
FOO=foo
TWO='one two'
TWOS=' one two '
C=','
C2=',,'
CSC=', ,'
SCSCS=' , , '
S1='one,two'
S2='one , two'
S3=',one,two'
S4='one,two,'
S5=' ,one ,two, '

argc foo foo${FOO}foo
argc sep foo${SPACE}foo
argc two foo${TWO}foo
argc twos foo${TWOS}foo
argc 2two foo${TWO}${TWO}foo
argc 2twos foo${TWOS}${TWOS}foo
argc comma foo${C}foo
argc C2 foo${C2}foo
argc CSC foo${CSC}foo
argc SCSCS foo${SCSCS}foo
argc S1 foo${S1}foo
argc S2 foo${S2}foo
argc S3 foo${S3}foo
argc S4 foo${S4}foo
argc S5 foo${S5}foo
========================================== the test script ends, results follow
foo: 1 args: <foofoofoo>
sep: 2 args: <foo> <foo>
two: 2 args: <fooone> <twofoo>
twos: 4 args: <foo> <one> <two> <foo>
2two: 3 args: <fooone> <twoone> <twofoo>
2twos: 6 args: <foo> <one> <two> <one> <two> <foo>
comma: 2 args: <foo> <foo>
C2: 3 args: <foo> <> <foo>
CSC: 3 args: <foo> <> <foo>
SCSCS: 3 args: <foo> <> <foo>
S1: 2 args: <fooone> <twofoo>
S2: 2 args: <fooone> <twofoo>
S3: 3 args: <foo> <one> <twofoo>
S4: 3 args: <fooone> <two> <foo>
S5: 4 args: <foo> <one> <two> <foo>
========================================== results end

For SCSCS mksh produces

SCSCS: 4 args: <foo> <> <> <foo>

For S5 mksh produces:

S5: 5 args: <foo> <> <one> <two> <foo>

That's using version MIRBSD KSH R59 2020/05/16
(aka R59b). I know that R59c exists, but it hasn't been upgraded
in NetBSD's pkgsrc yet... Further, the changelog doesn't indicate
any changes in this area, either in R59c or in the so far unreleased
version as it is now (or not that I noticed).

These differences are both examples of the same omission from
the standard I believe:

   Each occurrence in the input of a byte sequence that forms an IFS
   character that is not IFS white space, along with any adjacent IFS
   white space, shall delimit a field, as described previously.

In the case in S5, the input starts " ," where the " " is
IFS white space, and the ',' is an IFS character that is not IFS
white space, so that should delimit a field. That's what mksh
does. But isn't what anything else does, and isn't (I believe) the
intended behaviour. The "as described previously" (which would
be clearer if it said precisely where it was previously described)
presumably refers to:

   If no fields are delimited, for example if the input is empty or
   consists entirely of IFS white space, the result shall be zero
   fields (rather than an empty field).

which is, I think, intending to say that if nothing has been seen when
a field is delimited, and no field has been delimited previously, then
nothing results, rather than an empty field. But it doesn't say
that exactly. If that isn't the "as described previously" then
I have no idea what is.

The results show other discrepancies (in all shells) from what the
standard seems to require.

Eg: the "sep" test contains an expansion that is entirely IFS white
space, that IFS white space is at the beginning (and end) of the
input, so according to 2.6.5 3.a that IFS white space should be
ignored. To me "ignored" should mean "treated as if it is missing".

If that were true the result of the "sep" test would be a single
field "foofoo" (just as it is "foofoofoo" in the foo test where the
input contains no IFS characters at all). But it isn't, it is two
fields - the expansion contributes no data to the result, but it does
serve to separate the prefix and suffix in the prefix${VAR}suffix
case - the "twos" "S3" and "S4" tests show that any leading/trailing
white space in the input serve to separate any field produced by the
expansion being field split from the prefix or suffix (as applicable).

This whole section needs (yet another) complete rewrite.
When doing that bullet point 1 should simply be dropped, it says nothing
that bullet point 3 doesn't also say (except the example, which ought to
be expanded to more than one example, after the normative text).
I am working on new wording, which I will append as a note when I have
something suitable. In the meantime, everyone else, feel free to make
suggestions. This is all a mess - and it should be simple.
ifs (1 KB) 2023-03-31 01:55
IFS-test (3 KB) 2023-09-07 15:06
POSIX-bug-1649-impl.sh (8 KB) 2023-09-07 15:07
Expected-Results (2 KB) 2023-09-07 15:09
Revised-bug-1649-suggestion (6 KB) 2023-09-11 03:59
Notes
(0006488)
Don Cragun   
2023-09-25 15:43   
(edited on: 2023-10-02 17:49)
Replace XCU section 2.6.5 on issue 8 draft 3 P2476, L80478-80504 with:
After parameter expansion (Section 2.6.2), command substitution (Section 2.6.3), and arithmetic expansion (Section 2.6.4), and if the shell variable IFS [xref XCU 2.5.3] is set and its value is not empty, or if the IFS variable is unset, the shell shall scan each field containing results of expansions and substitutions that did not occur in double-quotes for field splitting; zero, one or multiple fields can result.

For the remainder of this section, any reference to the results of an expansion, or results of expansions, shall be interpreted to mean the results from one or more unquoted variable or arithmetic expansions, or unquoted command substitutions.

If the IFS variable is set and has an empty string as its value, no field splitting occurs. However if an input field which contained the results of an expansion is entirely empty, it shall be removed. Note that this occurs before quote removal; any input field that contains any quoting characters can never be empty at this point. After the removal of any such fields from the input, the possibly modified input field list becomes the output.
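
A small sketch of the empty-IFS case described above, under the assumption that the shell follows this wording: a non-empty unquoted expansion is left as one field, and an entirely empty one is removed. The function name count_with_empty_ifs is hypothetical.

```sh
# Hypothetical helper: print how many fields an unquoted $value yields
# when IFS is set to the empty string.
count_with_empty_ifs() (
    IFS=
    value=$1
    set -f              # no pathname expansion, so only field handling is observed
    set -- $value       # unquoted on purpose
    printf '%s\n' "$#"
)

count_with_empty_ifs 'one two'   # one field: no splitting occurs
count_with_empty_ifs ''          # zero fields: the empty field is removed
```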

Each input field is considered in sequence, first to last, with the results of the algorithm described in this section causing output fields to be generated, which remain in the same order as the input fields from which they originated.

Fields which contain no results from expansions shall not be affected by field splitting, and shall remain unaltered, simply moving from the list of input fields to be next in the list of output fields.

In the remainder of this description, it is assumed that there is present in the field at least one expansion result; this assumption will not be restated. Field splitting only ever alters those parts of the field.

For the purposes of this section, the term "IFS white space" shall mean any of the white-space bytes [xref to XBD 3.412, 3.413, and 3.414] <space>, <tab> or <newline> from the Portable Character Set [xref XBD 6.1] which are present in the value of the IFS variable, and perhaps other white-space characters. It is implementation defined whether other white-space characters which appear in the value of IFS are also considered as "IFS white space". The three characters above specified as IFS white-space bytes are always IFS white space, when they occur in the value of IFS, regardless of whether they are white-space characters in any relevant locale. For other locale specific white-space characters allowed by the implementation it is unspecified whether the character is considered as IFS white space if it is white space at the time it is assigned to the IFS variable, or if it is white space at the time field splitting occurs (the locale may have changed between those events).

If the IFS variable is unset, then for the purposes of this section, but without altering the value of the variable, its value shall be considered to contain the three single byte characters <space>, <tab> and <newline> from the portable character set, all of which are IFS white-space characters.

The shell shall use the byte sequences that form the characters in the value of the IFS variable as delimiters. Each of the characters <space> <tab> and <newline> which appears in the value of IFS shall be a single byte delimiter. The shell shall use these delimiters as field terminators to split the results of expansions, along with other adjacent bytes, into separate fields, as described below. Note that these delimiters terminate a field; they do not, of themselves, cause a new field to start; subsequent bytes that are not from the results of an expansion, or that do not form IFS white-space characters, are required for a new field to begin.

Note that the shell processes arbitrary bytes from the input fields; there is no requirement that those bytes form valid characters.

If results of the algorithm are that no fields are delimited, that is, if the input field is wholly empty or consists entirely of IFS white space, the result shall be zero fields (rather than an empty field).

For the purposes of this section, when a field is said to be delimited, then the candidate field, as generated below shall become an output field. When the algorithm transforms a candidate into an output field it shall be appended to the current list of output fields.

Each field containing the results from an expansion shall be processed in order, intermixed with fields not containing the results of expansions, processed as described above, as if as follows, examining bytes in the input field, from beginning to end:
Begin with an empty candidate field and the input as specified above.

When instructed to start the next iteration of the loop, this is the start of the loop. While the input (as modified by earlier iterations of this loop) is not empty:
Consider the leading remaining byte or byte sequence of the input. No such byte sequence shall contain data such that some bytes in the sequence resulted from an expansion, and others did not, or which contains bytes resulting from the results of more than one expansion. If the byte or sequence of bytes is:
  1. A byte (or sequence of bytes) in the input that did not result from an expansion:
    Append this byte (or sequence) to the candidate, and remove it from the input. Start the next iteration of the loop.
  2. A byte sequence in the input which resulted from an expansion that does not form a character in IFS:
    Append the first byte of the sequence to the candidate, and remove that byte from the input. Start the next iteration of the loop.
  3. A byte sequence in the input which resulted from an expansion that forms an IFS white space character:
    Remove that byte sequence from the input, consider the new leading input byte sequence, and repeat this step.
  4. A byte sequence in the input that resulted from an expansion that forms an IFS character, which is not IFS white space:
    Remove that byte sequence from the input, but note it was observed.
At this point, if the candidate is not empty, or if a sequence of bytes representing an IFS character that is not IFS white space was seen at step 4, then a field is said to have been delimited, and the candidate becomes an output field.

Empty (clear) the candidate, and start the next iteration of the loop.
Once the input is empty, the candidate becomes an output field if and only if it is not empty.
The ordered list of output fields so produced, which may be empty, replaces the list of input fields.
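
The loop above can be spot-checked against a shell with a sketch like the following. It is not the reporter's test script; count_fields is a hypothetical helper that counts the fields produced by the unquoted word foo${value}foo, and the sample values are taken from the description's sep, C2, SCSCS, and S5 cases.

```sh
# Hypothetical helper: count_fields IFS_VALUE VALUE
# Prints the number of fields produced by the word  foo${VALUE}foo
count_fields() (
    IFS=$1
    value=$2
    set -f                      # disable pathname expansion
    set -- foo${value}foo       # prefix and suffix are not from an expansion
    printf '%s\n' "$#"
)

count_fields ' ,' ' '              # "sep":   2 fields, <foo> <foo>
count_fields ' ,' ',,'             # "C2":    3 fields, <foo> <> <foo>
count_fields ' ,' ' , , '          # "SCSCS": 3 fields (mksh R59b was reported to give 4)
count_fields ' ,' ' ,one ,two, '   # "S5":    4 fields (mksh R59b was reported to give 5)
```

The function body runs in a subshell so the IFS and set -f changes stay local to each call.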






Viewing Issue Advanced Details
1650 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 08:46 2023-06-13 11:18
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3103-3136
See Note: 0006301.
Words 'prerequisite' and 'dependency' used interchangeably
In the description of the make utility, the words 'prerequisite' and 'dependency' are used interchangeably. Since these words mean the same thing, there's no need to use two separate words for it.
Replace all occurrences of 'dependency' or 'dependencies' with the corresponding form of 'prerequisite'.
Maybe reword the occurrences of 'depend'.
Notes
(0006301)
Don Cragun   
2023-06-01 16:04   
(edited on: 2023-06-01 16:07)
On page 3107 line 104641 section make, change:
A target shall be considered up-to-date if it exists and is newer than all of its dependencies
to:
A target shall be considered up-to-date if it exists and is newer than all of its prerequisites



On page 3111 line 104824 section make, change:
When source files are named in a dependency list, make treats them just like any other target. Because the source file is presumed to be present in the directory, there is no need to add an entry for it to the makefile. When a target has no dependencies, but is present in the directory, make assumes that that file is up-to-date.
to:
When source files are named in a list of prerequisites, make treats them just like any other target. Because the source file is presumed to be present in the directory, there is no need to add an entry for it to the makefile. When a target has no prerequisites, but is present in the directory, make assumes that that file is up-to-date.



On page 3116 line 105066 section make, change:
Dependencies added by target rules without commands
to:
Prerequisites added by target rules without commands






Viewing Issue Advanced Details
1652 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 08:53 2023-06-13 11:20
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3103
104488
make: missing option argument
The -j option is missing its argument 'maxjobs'.
At the beginning of line 104488, replace '-j' with '-j maxjobs'.
Notes




Viewing Issue Advanced Details
1653 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:05 2023-06-13 11:21
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3106
104597
Note: 0006302
make: confusing reference to word expansion
The current specification says:
> The difference
> between the contents of MAKEFLAGS and the make utility command line is
> that the contents of the variable shall not be subjected to the word expansions
> (see Section 2.6, on page 2468) associated with parsing the command line
> values.

The reference points to the shell command language, even though there is no shell involved at this point.

The text "parsing the command line values" is not related to word expansion, as the "command line values" are interpreted as separate strings, typically by getopt.
Remove the confusing sentence.

Alternatively, reword the sentence to not refer to the shell.
Notes
(0006302)
geoffclare   
2023-06-01 16:30   
Change:
The characters are formatted in a manner similar to a portion of the make utility command line: options are preceded by <hyphen-minus> characters and <blank>-separated as described in XBD Section 12.2 (on page 215). The macro=value macro definition operands can also be included. The difference between the contents of MAKEFLAGS and the make utility command line is that the contents of the variable shall not be subjected to the word expansions (see Section 2.6, on page 2468) associated with parsing the command line values.
to:
The characters are formatted in a manner similar to the use of the make utility in shell commands: options are preceded by <hyphen-minus> characters and <blank>-separated as described in XBD Section 12.2 (on page 215). The macro=value macro definition operands can also be included. The difference between the contents of MAKEFLAGS and the use of the make utility in shell commands is that the contents of the variable shall not be subjected to the word expansions (see Section 2.6, on page 2468) associated with parsing shell command lines.




Viewing Issue Advanced Details
1654 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:09 2023-06-27 14:45
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3121
105276
Note: 0006306
make: wrong quotes in example
The line says:
> CFLAGS = "-D COMMENT_CHAR='#'"

The double quotes are not needed.
Remove the double quotes.
Notes
(0006306)
geoffclare   
2023-06-05 09:43   
I assume the intention is that the single quotes are passed to the compiler (i.e. the value of the COMMENT_CHAR macro is a C character constant). In which case the double quotes are needed but the first one is in the wrong place.

Change:
CFLAGS = "-D COMMENT_CHAR='#'"
to:
CFLAGS = -D "COMMENT_CHAR='#'"
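
The effect of moving the first double quote can be checked with a rough model of shell tokenization. The sketch below uses Python's shlex.split as a stand-in for the shell's quote handling. The compiler name cc and the file prog.c are placeholders that do not appear in the bug report, and make's own processing of the macro value is not modelled.

```python
import shlex

# Hypothetical command lines as the shell would see them once $(CFLAGS)
# has been substituted; "cc" and "prog.c" are illustration-only names.
original = """cc "-D COMMENT_CHAR='#'" -c prog.c"""
corrected = """cc -D "COMMENT_CHAR='#'" -c prog.c"""
unquoted = """cc -D COMMENT_CHAR='#' -c prog.c"""

# shlex.split approximates POSIX shell quoting rules (no expansions).
original_args = shlex.split(original)
corrected_args = shlex.split(corrected)
unquoted_args = shlex.split(unquoted)

for label, args in (("original", original_args),
                    ("corrected", corrected_args),
                    ("unquoted", unquoted_args)):
    print(label, args)
```

Under this model the original form passes "-D COMMENT_CHAR='#'" as a single argument, the corrected form passes -D and COMMENT_CHAR='#' as two arguments with the single quotes preserved, and the unquoted form loses the single quotes.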




Viewing Issue Advanced Details
1655 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:13 2023-06-27 14:47
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3121
105280
make: wrong plural form
> The standard set of default rules use only features
Replace 'use' with 'uses'.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1656 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:28 2023-06-27 14:48
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3103
104451
make: too small scope in one-line summary
> make — maintain, update, and regenerate groups of programs

The wording "groups of programs" unnecessarily restricts the scope of the make utility.
Replace 'groups of programs' with 'files'.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1657 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:31 2023-06-27 14:51
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3107
104671
Note: 0006311
make: section 'Makefile Syntax' contains unrelated requirements
> By default, the following files shall be tried in sequence: ./makefile and ./Makefile.

This requirement is not about the syntax of makefiles; it is about their names.
Move the requirements on the names of makefiles to a separate section.

Alternatively, reword the section heading in line 104660.
Notes
(0006311)
geoffclare   
2023-06-05 16:01   
Change page 3107 line 104640:
The make utility attempts to perform the actions required to ensure that the specified targets are up-to-date. A target shall be considered up-to-date ...
to:
The make utility attempts to perform the actions, specified in one or more makefiles, required to ensure that specified targets are up-to-date. By default, the following files shall be tried in sequence: ./makefile and ./Makefile. If neither ./makefile nor ./Makefile is found, other implementation-defined files may also be tried. [XSI]On XSI-conformant systems, the additional files ./s.makefile, SCCS/s.makefile, ./s.Makefile, and SCCS/s.Makefile shall also be tried.[/XSI] The −f option shall direct make to ignore any of these default files and use the specified option-argument as a makefile instead. If this option-argument is '-', standard input shall be used.

The term makefile is used to refer to any rules provided by the user, whether in ./makefile or its variants, or specified by the −f option.

A target shall be considered up-to-date ...

and delete corresponding text in Makefile Syntax, pages 3107-3108 lines 104671-104678.
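
The lookup described in the new text can be sketched as follows. This is an illustration only: the function name choose_makefile and the exists callback are hypothetical, a single -f option is assumed, implementation-defined extra files are ignored, and the XSI names are tried after the two base names in the order listed above, which is an assumption about ordering that the text does not spell out.

```python
def choose_makefile(exists, f_option=None, xsi=False):
    """Return the makefile a make-like tool would read, or None.

    exists   -- callable taking a pathname and returning True if it exists
                (hypothetical hook so the sketch does not touch the disk)
    f_option -- option-argument of -f, or None when -f was not given
    xsi      -- whether the XSI-only SCCS names are also tried
    """
    if f_option is not None:
        # -f makes the default files irrelevant; '-' means standard input.
        return "<stdin>" if f_option == "-" else f_option
    candidates = ["./makefile", "./Makefile"]
    if xsi:
        candidates += ["./s.makefile", "SCCS/s.makefile",
                       "./s.Makefile", "SCCS/s.Makefile"]
    for name in candidates:
        if exists(name):
            return name
    return None


# Usage: only ./Makefile is present, so it is selected.
print(choose_makefile(lambda name: name == "./Makefile"))
```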




Viewing Issue Advanced Details
1658 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-03-31 09:40 2023-06-27 14:56
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
(several)
(several)
(several)
Note: 0006244
spell 'white space' consistently
Draft 3 contains 5 instances of the word 'whitespace' and 58 instances of the word 'white space'.

For consistency, only a single term should be used.
Replace 'whitespace' with 'white space'.
Notes
(0006244)
geoffclare   
2023-04-03 10:02   
(edited on: 2023-06-05 16:22)
On page 3066 line 102981 section m4, and
page 3066 line 102983 section m4, change:
whitespace characters
to:
white-space characters


On page 3112 line 104875 section make, change:
all whitespace at
to:
all white space at


On page 3114 line 104946 section make, change:
each whitespace-separated word
to:
each white-space-separated word


(I can only find the above four uses of "whitespace"; if there really is a fifth, please give the page and line number.)





Viewing Issue Advanced Details
1660 [Issue 8 drafts] Shell and Utilities Comment Error 2023-04-06 08:44 2023-06-27 14:58
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
make
3128
105558
Out of date make rationale about -n and $(MAKE)
Bug 0001436 added the requirement that command lines which expand the MAKE macro are still executed when -n is used, and removed some old rationale about this feature, but missed some later rationale that should also have been either removed or changed.
Change:
However, the System V convention of forcing command execution with -n when the command line of a target contains either of the strings "$(MAKE)" or "${MAKE}" has not been adopted. This functionality appeared in early proposals, but the danger of this approach was pointed out with the following example of a portion of a makefile:
subdir:
	cd subdir; rm all_the_files; $(MAKE)
The loss of the System V behavior in this case is well-balanced by the safety afforded to other makefiles that were not aware of this situation. In any event, the command line <plus-sign> prefix can provide the desired functionality.
to:
The System V convention of forcing command execution with -n when the command line of a target expands the MAKE macro was not adopted in earlier versions of this standard, but it is now required because it has become widespread existing practice.
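
As a rough illustration of which command lines still run under -n, the sketch below checks for a <plus-sign> prefix or a textual reference to the MAKE macro. The function name executed_under_n is hypothetical, the command line is assumed to have its leading <tab> already removed, and searching for the literal strings "$(MAKE)" and "${MAKE}" is a simplification: it does not detect MAKE being expanded indirectly through another macro.

```python
def executed_under_n(command_line):
    """Return True if this command line would still be executed with -n."""
    # Command prefixes are any combination of '-', '@' and '+'.
    i = 0
    while i < len(command_line) and command_line[i] in "-@+":
        i += 1
    if "+" in command_line[:i]:
        return True
    # Simplified check for an expansion of the MAKE macro.
    return "$(MAKE)" in command_line or "${MAKE}" in command_line


# Usage: the example from the old rationale is executed under -n.
print(executed_under_n("cd subdir; rm all_the_files; $(MAKE)"))
```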

There are no notes attached to this issue.




Viewing Issue Advanced Details
1661 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-04-06 08:57 2023-06-27 15:00
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
make
3111
104848
Description of .WAIT needs "if any" additions
Paul Smith pointed out on the mailing list on Sept 12, 2022 that the description of .WAIT added by bug 0001437 assumes the .WAIT has at least one prerequisite both left and right of it.
Change:
the prerequisites to the right of the .WAIT until the prerequisites to the left of it
to:
the prerequisites (if any) to the right of the .WAIT until the prerequisites (if any) to the left of it

There are no notes attached to this issue.




Viewing Issue Advanced Details
1662 [Issue 8 drafts] Shell and Utilities Objection Error 2023-04-11 14:28 2023-06-27 15:08
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
ed, ex
2798, 2802, 2805, 2837, 2846, 2854, 2855
92748, 92937, 92954, 93049, 93054, 94315, 94669, 94995, 95009, 95022
Delimiter issues in ed and ex
The sed delimiter issues from bugs 0001550 and 0001551 also affect ed and ex.

With: g.a\.b.p

the "\." is treated as a literal "." in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "." is treated as special.

With: g.a[.]b.p

the "." in the bracket expression is not a delimiter in all versions of ed I tried and in all versions of ex/vi I tried except nvi where it is a delimiter (producing "brackets ([ ]) not balanced").

With: s.a\.b.x.

the "\." is treated as a literal "." in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "." is treated as special.

With: s.a[.]b.x.

the "." in the bracket expression is not a delimiter in all versions of ed I tried and in all versions of ex/vi I tried except nvi where it is a delimiter.

With: s&a&x\&y&

the "\&" is treated as a literal "&" in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "&" is treated as special.

The nvi behaviour in all these cases is likely a bug in nvi, as it is intended to behave exactly the same as the original vi (except for added new features).

The proposed changes are adapted from the resolution of bug 0001550 but requiring one behaviour where for sed it is unspecified which of two behaviours occurs, on the assumption that the behaviour of nvi should be considered to be a bug; if we want to allow it, something closer to the new sed text will be needed.
After page 2798 line 92748 section ed (Regular Expressions in ed), add a new paragraph:
The start and end of a regular expression (RE) are marked by a delimiter character (although in some circumstances the end delimiter can be omitted). In addresses, the delimiter is either <slash> or <question-mark>. In commands, other characters can be used as the delimiter, as specified in the description of the command. Within the RE (as an ed extension to the BRE syntax), the delimiter shall not terminate the RE if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the RE (losing any special meaning it would have had if it was not used as the delimiter and was not escaped). In addition, the delimiter character shall not terminate the RE when it appears within a bracket expression, and shall have its normal meaning in the bracket expression. For example, the command "g%[%]%p" is equivalent to "g/[%]/p", and the command "s-[0-9]--g" is equivalent to "s/[0-9]//g".
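
A minimal sketch of the delimiter rule in the paragraph above, for illustration only. The names scan_field and split_s_command are hypothetical. The bracket-expression handling is simplified: a ']' directly after '[' or '[^' is taken as literal, but character classes, equivalence classes, and collating symbols are not parsed. The returned fields keep their escape sequences as written rather than converting an escaped delimiter to the literal character.

```python
def scan_field(text, start, delim, is_re):
    """Scan one delimited field; return (field_text, index_after_delimiter)."""
    out = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Second character of an escape sequence never ends the field.
            out.append(text[i:i + 2])
            i += 2
        elif is_re and ch == "[":
            # Inside a bracket expression the delimiter is an ordinary member.
            j = i + 1
            if j < len(text) and text[j] == "^":
                j += 1
            if j < len(text) and text[j] == "]":
                j += 1  # a leading ']' is literal
            while j < len(text) and text[j] != "]":
                j += 1
            out.append(text[i:j + 1])
            i = j + 1
        elif ch == delim:
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    # End delimiter omitted: the field runs to the end of the command.
    return "".join(out), len(text)


def split_s_command(command):
    """Split 's<d>RE<d>replacement<d>flags' into (RE, replacement, flags)."""
    delim = command[1]
    pattern, pos = scan_field(command, 2, delim, is_re=True)
    replacement, pos = scan_field(command, pos, delim, is_re=False)
    return pattern, replacement, command[pos:]


# Usage: the hyphen inside the bracket expression is not a delimiter.
print(split_s_command("s-[0-9]--g"))
```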

On page 2802 line 92937 section ed (g command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed].

On page 2802 line 92954 section ed (G command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed].

On page 2805 line 93049 section ed (s command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE and the replacement. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE and the replacement. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed]. Within the replacement, the delimiter shall not terminate the replacement if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the replacement (losing any special meaning it would have had if it was not used as the delimiter and was not escaped).

On page 2805 line 93054 section ed (s command), change:
An <ampersand> ('&') appearing in the replacement shall be replaced by the string matching the RE on the current line. The special meaning of '&' in this context can be suppressed by preceding it by <backslash>. As a more general feature, the characters '\n', where n is a digit, shall be replaced by the text matched by the corresponding back-reference expression. If the corresponding back-reference expression does not match, then the characters '\n' shall be replaced by the empty string. When the character '%' is the only character in the replacement, the replacement used in the most recent substitute command shall be used as the replacement in the current substitute command; if there was no previous substitute command, the use of '%' in this manner shall be an error. The '%' shall lose its special meaning when it is in a replacement string of more than one character or is preceded by a <backslash>. For each <backslash> encountered in scanning replacement from beginning to end, the following character shall lose its special meaning (if any). It is unspecified what special meaning is given to any character other than <backslash>, '&', '%', or digits.
to:
An unescaped <ampersand> ('&') appearing in the replacement shall be replaced by the string matching the RE on the current line. As a more general feature, the characters '\n', where the <backslash> is unescaped and n is a digit, shall be replaced by the text matched by the corresponding back-reference expression. If the corresponding back-reference expression does not match, then the characters '\n' shall be replaced by the empty string. When the character '%' is the only character in replacement, the replacement used in the most recent substitute command shall be used as replacement in the current substitute command; if there was no previous substitute command, the use of '%' in this manner shall be an error. The '%' shall lose its special meaning when it is in a replacement string of more than one character or is escaped. It is unspecified what special meaning is given to any character other than <backslash>, '&', '%', or digits.
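
The replacement rules above can be sketched as a small expansion routine. This is an illustration under stated assumptions: expand_replacement is a hypothetical name, Python's re module (which does not use BRE syntax) stands in for ed's matcher, back-references are limited to \1 through \9, and neither the lone '%' rule nor line splitting via an escaped <newline> is modelled.

```python
import re


def expand_replacement(replacement, match):
    """Expand unescaped '&' and '\\digit' in an ed-style replacement string."""
    out = []
    i = 0
    while i < len(replacement):
        ch = replacement[i]
        if ch == "\\" and i + 1 < len(replacement):
            nxt = replacement[i + 1]
            if nxt in "123456789":
                group = int(nxt)
                # A group that did not match, or does not exist, yields "".
                text = match.group(group) if group <= match.re.groups else None
                out.append(text or "")
            else:
                # An escaped character loses any special meaning it had.
                out.append(nxt)
            i += 2
        elif ch == "&":
            out.append(match.group(0))  # text matched by the whole RE
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Usage: '&' is the whole match, '\1' the first group, '\&' a literal '&'.
m = re.search(r"(b+)c", "abbcd")
print(expand_replacement(r"[&|\1|\&]", m))
```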

After page 2854 line 94995 section ex (Regular Expressions in ex), add a new paragraph:
The start and end of a regular expression (RE) are marked by a delimiter character (although in some circumstances the end delimiter can be omitted). In addresses, the delimiter is either <slash> or <question-mark>. In commands, other characters can be used as the delimiter, as specified in the description of the command. Within the RE (as an ex extension to the BRE syntax), the delimiter shall not terminate the RE if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the RE (losing any special meaning it would have had if it was not used as the delimiter and was not escaped). In addition, the delimiter character shall not terminate the RE when it appears within a bracket expression, and shall have its normal meaning in the bracket expression. For example, the command "g%[%]%p" is equivalent to "g/[%]/p", and the command "s-[0-9]--g" is equivalent to "s/[0-9]//g".

After page 2855 line 95009 section ex (Replacement Strings in ex), add a new paragraph:
Certain characters and strings have special meaning in replacement strings when the character, or the first character of the string, is unescaped.

On page 2855 line 95022 section ex (Replacement Strings in ex), change:
Otherwise, any character following a <backslash> shall be treated ...
to:
Otherwise, any character following an unescaped <backslash> shall be treated ...

On page 2837 line 94315 section ex (g command), after:
The pattern can be delimited by <slash> characters (shown in the Synopsis), as well as any non-alphanumeric or non-<blank> other than <backslash>, <vertical-line>, <newline>, or double-quote.
add:
Within the pattern, in certain circumstances the delimiter can be used as a literal character; see [xref to Regular Expressions in ex].

On page 2846 line 94669 section ex (s command), change:
Any non-alphabetic, non-<blank> delimiter other than <backslash>, '|', <newline>, or double-quote can be used instead of '/'. <backslash> characters can be used to escape delimiters, <backslash> characters, and other special characters.
to:
Any non-alphabetic, non-<blank> delimiter other than <backslash>, '|', <newline>, or double-quote can be used instead of '/'. Within the pattern, in certain circumstances the delimiter can be used as a literal character; see [xref to Regular Expressions in ex]. Within the replacement, the delimiter shall not terminate the replacement if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the replacement (losing any special meaning it would have had if it was not used as the delimiter and was not escaped).

There are no notes attached to this issue.




Viewing Issue Advanced Details
1663 [Issue 8 drafts] System Interfaces Comment Enhancement Request 2023-04-11 15:02 2023-06-27 15:12
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
realpath()
1861
61474
Remove XSI shading from realpath()
Since the realpath utility is being added in Issue 8 as a mandatory utility (no XSI shading), for consistency the realpath() function should be moved from the XSI option to the Base.
On page 382 line 13463 section <stdlib.h> change XSI shading on realpath() to CX shading.
(This will produce three consecutive CX lines which should be combined into a block.)

On page 1861 line 61474 section realpath() remove XSI shading from SYNOPSIS.

On page 3299 line 112187 section realpath, delete the paragraph:
Although the behavior of the realpath utility is specified by reference to the realpath() function, which is part of the XSI option, non-XSI implementations that do not support realpath() are nevertheless required to implement realpath in accordance with the requirements described in this standard for realpath().

On page 3911 line 135943 section E.1 add realpath() to POSIX_SYMBOLIC_LINKS

On page 3912 line 135992 section E.1 remove realpath() from XSI_FILE_SYSTEM
There are no notes attached to this issue.




Viewing Issue Advanced Details
1664 [Issue 8 drafts] System Interfaces Comment Error 2023-04-11 15:37 2023-06-27 15:15
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
readdir()
1850, 1851
61088, 61117
Some readdir() non-normative text needs updating to reflect normative text changes
The resolution of bug 0000696 missed an APPLICATION USAGE change, and the RATIONALE change could be improved.
On page 1850 line 61088 section readdir(), change:
The readdir_r() function is thread-safe and shall return values in a user-supplied buffer instead of possibly using a static data area that may be overwritten by each call.
to:
The readdir_r() function returns values in a user-supplied buffer, but does not allow the size of the buffer to be specified by the caller. If {NAME_MAX} is indeterminate, there is no way for an application to know how large the buffer needs to be and readdir_r() cannot safely be used.

On page 1851 line 61117 section readdir(), change:
The readdir_r() function returns values in a user-supplied buffer instead of possibly using a static data area that may be overwritten by each call. Either the {NAME_MAX} compile-time constant or the corresponding pathconf() option can be used to determine the maximum sizes of returned pathnames. However, since the size of a filename has no limit on some filesystem types, there is no way to reliably allocate a buffer large enough to hold a filename being returned by readdir_r(). Therefore, readdir_r() has been marked obsolescent and readdir() is now required to be thread safe as long as there are no concurrent calls to it on a single directory stream.
to:
Historically, readdir() returned a pointer to an internal static buffer that was overwritten by each call. The readdir_r() function was added as a thread-safe alternative that returns values in a user-supplied buffer. However, it does not allow the size of the buffer to be specified by the caller, and so is only usable if {NAME_MAX} is a compile-time constant or fpathconf() with _PC_NAME_MAX returns a value other than -1. If {NAME_MAX} is indeterminate (indicated by fpathconf() returning -1), there is no way to reliably allocate a buffer large enough to hold a filename being returned by readdir_r(). Therefore, readdir_r() has been marked obsolescent and readdir() is now required to be thread safe as long as there are no concurrent calls to it on a single directory stream.
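
The "indeterminate {NAME_MAX}" case can be probed from a script. The sketch below uses Python's os.pathconf wrapper with the name "PC_NAME_MAX" (Python's spelling of the pathconf name-length query); the helper name name_max is hypothetical, and the result depends on the platform and file system.

```python
import os


def name_max(directory="."):
    """Return the filename length limit for directory, or None if indeterminate."""
    # os.pathconf returns -1 when the limit is indeterminate and raises
    # OSError for real failures such as a missing directory.
    limit = os.pathconf(directory, "PC_NAME_MAX")
    return None if limit == -1 else limit


# Usage: a caller sizing a buffer from this value has no safe choice
# when the result is None.
limit = name_max(".")
print("indeterminate" if limit is None else limit)
```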

There are no notes attached to this issue.




Viewing Issue Advanced Details
1665 [Issue 8 drafts] System Interfaces Objection Error 2023-04-13 09:15 2023-06-27 15:17
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
open()
1506
50451, 50458
Contradictory text in descriptions of O_EXEC and O_SEARCH
The changes from bug 0000658 talk about the possibility of open() with O_EXEC on a directory, or O_SEARCH on a non-directory file, opening the file with an unspecified access mode. However, there is no circumstance in which text elsewhere allows this to happen.

For O_EXEC on a directory:
  • If O_EXEC is the same value as O_SEARCH, the description of O_SEARCH applies and the directory is opened for searching.

  • If O_EXEC is not the same value as O_SEARCH, the ERRORS section mandates an EISDIR error.


Similarly for O_SEARCH on a non-directory file and ENOTDIR.
On page 1506 line 50451 section open() O_EXEC, change:
If path names a directory, it is unspecified whether open() fails, or whether the directory is opened but with an unspecified access mode.
to:
If path names a directory and O_EXEC is not the same value as O_SEARCH, open() shall fail.

On page 1506 line 50458 section open() O_SEARCH, change:
If path names a non-directory file, it is unspecified whether open() fails, or whether the file is opened but with an unspecified access mode.
to:
If path names a non-directory file and O_SEARCH is not the same value as O_EXEC, open() shall fail.

On page 1506 line 50462 section open(), delete:
If a file is successfully opened with an unspecified access mode, an application can use fcntl() to discover the access mode that was selected.

On page 1513 line 50753 section open() RATIONALE, change:
Although the standard allows open() to fail on an attempt to use O_EXEC on a directory, or O_SEARCH on a non-directory, this is only possible in implementations where the two modes have distinct values.
to:
Although this standard requires open() to fail on an attempt to use O_EXEC on a directory, or O_SEARCH on a non-directory, this only applies in implementations where the two modes have distinct values.
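
The case analysis behind these changes can be written out as a small decision function. This is a model of the reasoning only, not of any implementation: open_outcome is a hypothetical name, flags are passed as strings, and permission checks and all other open() errors are ignored.

```python
def open_outcome(flag, is_directory, same_value):
    """Model the outcome of open() with O_EXEC or O_SEARCH.

    flag         -- "O_EXEC" or "O_SEARCH"
    is_directory -- whether path names a directory
    same_value   -- whether O_EXEC and O_SEARCH have the same value
    """
    if flag == "O_EXEC":
        if is_directory:
            # With identical values the O_SEARCH description applies.
            return "opened for searching" if same_value else "EISDIR"
        return "opened for execution"
    if flag == "O_SEARCH":
        if not is_directory:
            # With identical values the O_EXEC description applies.
            return "opened for execution" if same_value else "ENOTDIR"
        return "opened for searching"
    raise ValueError("flag must be 'O_EXEC' or 'O_SEARCH'")


# Usage: no combination yields an "unspecified access mode".
for flag in ("O_EXEC", "O_SEARCH"):
    for is_dir in (True, False):
        for same in (True, False):
            print(flag, is_dir, same, open_outcome(flag, is_dir, same))
```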

There are no notes attached to this issue.




Viewing Issue Advanced Details
1666 [Issue 8 drafts] System Interfaces Objection Error 2023-04-13 09:50 2023-06-27 15:20
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
getresgid(), getresuid()
1170, 1171
40031, 40061
getresgid() and getresuid() are missing "restrict"
These functions modify multiple values of the same type via pointers passed as arguments, and there is no reason to allow applications to pass the same pointer value in two or more arguments, so the functions should have "restrict" in their prototypes.
On page 470 line 16562,16563 section <unistd.h>, change:
int    getresgid(gid_t *, gid_t *, gid_t *);
int    getresuid(uid_t *, uid_t *, uid_t *);
to:
int    getresgid(gid_t *restrict, gid_t *restrict, gid_t *restrict);
int    getresuid(uid_t *restrict, uid_t *restrict, uid_t *restrict);

On page 1170 line 40031 section getresgid(), change:
int getresgid(gid_t *rgid, gid_t *egid, gid_t *sgid);
to:
int getresgid(gid_t *restrict rgid, gid_t *restrict egid, gid_t *restrict sgid);

On page 1171 line 40061 section getresuid(), change:
int getresuid(uid_t *ruid, uid_t *euid, uid_t *suid);
to:
int getresuid(uid_t *restrict ruid, uid_t *restrict euid, uid_t *restrict suid);

Notes




Viewing Issue Advanced Details
1667 [Issue 8 drafts] System Interfaces Objection Omission 2023-04-13 09:54 2023-06-27 15:22
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
2.4.3 Signal Actions
517
18340-18392
[gs]etres[gu]id() should be async-signal-safe
The new getresgid(), getresuid(), setresgid(), and setresuid() functions should be added to the list of async-signal-safe functions, since getegid(), geteuid(), getgid(), getuid(), setegid(), seteuid(), setgid(), setregid(), setreuid(), and setuid() are already in that list.
On page 517 line 18340-18392 section 2.4.3 Signal Actions, add:
getresgid(), getresuid(), setresgid(), and setresuid()
at the appropriate places in the table.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1669 [Issue 8 drafts] Shell and Utilities Comment Enhancement Request 2023-04-17 14:15 2023-07-03 10:44
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
ulimit
3442
117558
Make the ulimit utility consistent with [gs]etrlimit() wrt XSI
In draft 3 getrlimit() and setrlimit() were moved from XSI to Base (except for RLIMIT_CPU and RLIMIT_FSIZE). Since the ulimit utility provides corresponding functionality in shells, it should be made consistent with [gs]etrlimit() as regards what features are XSI.

There is a slight complication - the file size limit is the default resource for ulimit when no resource option is specified, and it would be strange to have the default be XSI. The reason RLIMIT_FSIZE is XSI is because of the SIGXFSZ signal, so the obvious solution is to treat only the SIGXFSZ requirements as XSI and to make the ability to (get and) set the file size limit be a Base requirement. This will also affect the EFBIG errors for functions that write to files and a few other file-size-limit-related things.
On page 398 line 13963 section <sys/resource.h>, remove XSI shading from RLIMIT_FSIZE.

On page 507 line 17919 section 2.3 Error Numbers (EFBIG), change:
The size of a file would exceed the maximum file size of an implementation or offset maximum established in the corresponding file description.
to:
The size of a file would exceed the implementation's maximum file size, the file size limit of the process, or the offset maximum established in the corresponding open file description.

On page 605 line 21517 section aio_write(), change:
[XSI]If the request would cause the file size to exceed the soft file size limit for the process and there is no room for any bytes to be written, the request shall fail and the implementation shall generate the SIGXFSZ signal for the thread.[/XSI]
to:
If the request would cause the file size to exceed the soft file size limit for the process and there is no room for any bytes to be written, the request shall fail [XSI]and the implementation shall generate a SIGXFSZ signal for the thread[/XSI].

On page 606 line 21550 section aio_write(), change:
[XSI][EFBIG]
The file is a regular file, aiocbp->aio_nbytes is greater than 0, and there is no room for any bytes to be written at the starting position without exceeding the file size limit for the process. A SIGXFSZ signal shall also be sent to the thread.[/XSI]
to:
[EFBIG]
The file is a regular file, aiocbp->aio_nbytes is greater than 0, and there is no room for any bytes to be written at the starting position without exceeding the file size limit for the process. [XSI]A SIGXFSZ signal shall also be generated for the thread.[/XSI]

On page 868 line 29616 section exec, remove XSI shading from:
File size limit (see getrlimit() and setrlimit())

On page 868 line 29620 section exec, remove XSI shading from:
Resource limits

On page 868 line 29644 section exec, remove XSI shading from:
The saved resource limits in the new process image are set to be a copy of the process' corresponding hard and soft limits.

On page 896 line 30567 section fclose(), and
page 939 line 31953 section fflush(), and
page 1005 line 34499 section fputc(), and
page 1009 line 34648 section fputwc(), and
page 1042 line 35806 section fseek(), and
page 1045 line 35922 section fsetpos(), change:
[XSI][EFBIG]
An attempt was made to write a file that exceeds the file size limit of the process. A SIGXFSZ signal shall also be sent to the thread.[/XSI]
to:
[CX][EFBIG]
An attempt was made to write a file that exceeds the file size limit of the process.[/CX] [XSI]A SIGXFSZ signal shall also be generated for the thread.[/XSI]

On page 1066 line 36626 section ftruncate(), and
page 2283 line 74481 section truncate(), change:
[XSI]If the request would cause the file size to exceed the soft file size limit for the process, the request shall fail and the implementation shall generate the SIGXFSZ signal for the thread.[/XSI]
to:
If the request would cause the file size to exceed the soft file size limit for the process, the request shall fail [XSI]and the implementation shall generate a SIGXFSZ signal for the thread[/XSI].

On page 1066 line 36648 section ftruncate(), and
page 2283 line 74496 section truncate(), change:
[XSI][EFBIG]
The length argument exceeds the file size limit of the process. A SIGXFSZ signal shall also be sent to the thread.[/XSI]
to:
[EFBIG]
The length argument exceeds the file size limit of the process. [XSI]A SIGXFSZ signal shall also be generated for the thread.[/XSI]

On page 1172 line 40121 section getrlimit(), change:
[XSI]RLIMIT_FSIZE
This is the maximum size of a file, in bytes, that can be created by a process. If a write or truncate operation would cause this limit to be exceeded, SIGXFSZ shall be generated for the thread. If the thread is blocking, or the process is catching or ignoring SIGXFSZ, continued attempts to increase the size of a file from end-of-file to beyond the limit shall fail with errno set to [EFBIG].[/XSI]
to:
RLIMIT_FSIZE
This is the maximum size of a file, in bytes, that can be created by a process. If a write or truncate operation would cause this limit to be exceeded, [XSI]a SIGXFSZ signal shall be generated for the thread; if the thread is blocking, or the process is catching or ignoring SIGXFSZ,[/XSI] the operation shall fail with an [EFBIG] error.

On page 1555 line 52234 section posix_fallocate(), change:
[XSI][EFBIG]
The value of offset+len exceeds the file size limit of the process. A SIGXFSZ signal shall also be sent to the thread.[/XSI]
to:
[EFBIG]
The value of offset+len exceeds the file size limit of the process. [XSI]A SIGXFSZ signal shall also be generated for the thread.[/XSI]

On page 2423 line 78472 section write(), change:
for example, [XSI]the file size limit of the process or[/XSI] the physical end of a medium
to:
for example, the file size limit of the process or the physical end of a medium

On page 2423 line 78477 section write(), change:
[XSI]If the request would cause the file size to exceed the soft file size limit for the process and there is no room for any bytes to be written, the request shall fail and the implementation shall generate the SIGXFSZ signal for the thread.[/XSI]
to:
If the request would cause the file size to exceed the soft file size limit for the process and there is no room for any bytes to be written, the request shall fail [XSI]and the implementation shall generate a SIGXFSZ signal for the thread[/XSI].

On page 2425 line 78551 section write(), change:
[XSI][EFBIG]
An attempt was made to write a file that exceeds the file size limit of the process, and there was no room for any bytes to be written. A SIGXFSZ signal shall also be sent to the thread.[/XSI]
to:
[EFBIG]
An attempt was made to write a file that exceeds the file size limit of the process, and there was no room for any bytes to be written. [XSI]A SIGXFSZ signal shall also be generated for the thread.[/XSI]

On page 2505 line 81712 section 2.13 Shell Execution Environment, remove XSI shading from:
File size limit as set by ulimit

On page 3442 line 117558 section ulimit, remove XSI shading from the entire SYNOPSIS except for the -t option.

On page 3442 line 117593 section ulimit, add XSI shading to the description of -t.
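
The split between the Base [EFBIG] error and the XSI SIGXFSZ signal can be observed with a short experiment. The sketch below is illustrative and assumes a system that enforces RLIMIT_FSIZE on the temporary directory's file system and a hard limit of at least 1024 bytes. The function name write_past_limit is hypothetical. SIGXFSZ is explicitly ignored so that, where the signal is generated, the write fails with an error instead of terminating the process.

```python
import errno
import os
import resource
import signal
import tempfile


def write_past_limit(limit=1024):
    """Return the errno from writing one byte past a lowered soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    old_handler = signal.signal(signal.SIGXFSZ, signal.SIG_IGN)
    resource.setrlimit(resource.RLIMIT_FSIZE, (limit, hard))
    try:
        with tempfile.TemporaryFile() as f:
            fd = f.fileno()
            os.write(fd, b"x" * limit)  # fill the file exactly to the limit
            try:
                os.write(fd, b"y")      # no room for any bytes
            except OSError as e:
                return e.errno
            return None
    finally:
        # Restore the previous soft limit and signal disposition.
        resource.setrlimit(resource.RLIMIT_FSIZE, (soft, hard))
        signal.signal(signal.SIGXFSZ, old_handler)


# Usage: report the symbolic name of the error that was observed, if any.
err = write_past_limit()
print(errno.errorcode.get(err, err))
```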
There are no notes attached to this issue.




Viewing Issue Advanced Details
1670 [Issue 8 drafts] System Interfaces Objection Omission 2023-04-17 15:28 2023-07-03 10:46
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
fcntl()
904
30877
F_GETOWN_EX and F_SETOWN_EX missing from fcntl() RETURN VALUE
Bug 0001274 added F_GETOWN_EX and F_SETOWN_EX to the fcntl() DESCRIPTION but not to RETURN VALUE.
After:
F_SETOWN
Value other than -1.
add:
F_GETOWN_EX
Value other than -1.
F_SETOWN_EX
Value other than -1.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1671 [Issue 8 drafts] System Interfaces Objection Error 2023-04-18 08:54 2023-07-03 10:48
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
fcntl()
902
30778
OFD-owned locks addition missed some changes
The bug 0000768 changes on the fcntl() page to add OFD-owned file locks omitted needed updates to two paragraphs that describe shared locks and exclusive locks.
Change:
When a shared lock is set on a segment of a file, other processes shall be able to set shared locks on that segment or a portion of it. A shared lock prevents any other process from setting an exclusive lock on any portion of the protected area. A request for a shared lock shall fail if the file descriptor was not opened with read access.

An exclusive lock shall prevent any other process from setting a shared lock or an exclusive lock on any portion of the protected area. A request for an exclusive lock shall fail if the file descriptor was not opened with write access.
to:
When a shared lock is set on a segment of a file, other processes can set shared process-owned locks, and other open file descriptions can be used to set shared OFD-owned locks, on that segment or a portion of it. A shared process-owned lock shall prevent any other process from setting an exclusive process-owned lock, and shall prevent any exclusive OFD-owned lock from being set, on any portion of the protected area. A shared OFD-owned lock shall prevent any other open file description from being used to set an exclusive OFD-owned lock, and shall prevent any exclusive process-owned lock from being set, on any portion of the protected area. A request for a shared lock shall fail if the file descriptor is not open for reading.

An exclusive process-owned lock shall prevent any other process from setting a shared or exclusive process-owned lock, and shall prevent any shared or exclusive OFD-owned lock from being set, on any portion of the protected area. An exclusive OFD-owned lock shall prevent any other open file description from being used to set a shared or exclusive OFD-owned lock, and shall prevent any shared or exclusive process-owned lock from being set, on any portion of the protected area. A request for an exclusive lock shall fail if the file descriptor is not open for writing.
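
The interaction between OFD-owned and process-owned locks described above can be sketched in C as follows. This is an illustration under assumptions, not a reference implementation: the helper names and the /tmp path are hypothetical, and defining _GNU_SOURCE is assumed to be necessary only where F_OFD_SETLK is not yet exposed by default (for example on glibc).

```c
#define _GNU_SOURCE   /* assumption: exposes F_OFD_SETLK on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Tries to set a non-blocking whole-file lock of the given type (F_RDLCK
 * or F_WRLCK) with command cmd (F_SETLK or F_OFD_SETLK).
 * Returns 0 on success or the errno value on failure. */
static int try_lock(int fd, int cmd, short type)
{
    struct flock fl;

    assert(type == F_RDLCK || type == F_WRLCK);
    memset(&fl, 0, sizeof fl);   /* l_pid must be 0 for OFD-owned locks */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                /* 0 means "through end of file" */
    return fcntl(fd, cmd, &fl) == -1 ? errno : 0;
}

/* Hypothetical demo: opens one file through two open file descriptions
 * in the same process and records the outcome of four lock requests.
 * Returns 0 if the experiment ran, -1 if it could not be set up. */
int ofd_lock_demo(int results[4])
{
    char path[] = "/tmp/ofd_demo_XXXXXX";   /* assumed writable directory */
    int fd1 = mkstemp(path);
    if (fd1 == -1)
        return -1;
    int fd2 = open(path, O_RDWR);           /* second open file description */
    unlink(path);
    if (fd2 == -1) {
        close(fd1);
        return -1;
    }

    results[0] = try_lock(fd1, F_OFD_SETLK, F_RDLCK);  /* shared OFD-owned lock */
    results[1] = try_lock(fd2, F_OFD_SETLK, F_RDLCK);  /* second shared lock: compatible */
    results[2] = try_lock(fd2, F_OFD_SETLK, F_WRLCK);  /* exclusive OFD-owned: conflicts with fd1 */
    results[3] = try_lock(fd2, F_SETLK, F_WRLCK);      /* exclusive process-owned: also conflicts */

    close(fd1);
    close(fd2);
    return 0;
}
```

With the wording above, the first two requests would be expected to succeed and the last two to fail with EAGAIN or EACCES, even though all four come from a single process.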

There are no notes attached to this issue.




Viewing Issue Advanced Details
1672 [Issue 8 drafts] System Interfaces Comment Clarification Requested 2023-04-18 09:11 2023-07-03 10:49
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
lockf()
1355
45590
Wording improvement to lockf() APPLICATION USAGE
Bug 0000768 added a paragraph to the fcntl() APPLICATION USAGE that was based on existing text on the lockf() page, but with improved wording. The same wording improvement should be made on the lockf() page.
Change:
Record-locking should not be used in combination with the fopen(), fread(), fwrite(), and other stdio functions. Instead, the more primitive, non-buffered functions (such as open()) should be used. Unexpected results may occur in processes that do buffering in the user address space. The process may later read/write data which is/was locked. The stdio functions are the most common source of unexpected buffering.
to:
Record-locking should not be used in combination with buffered standard I/O streams (see [xref to Section 2.5]). Instead, non-buffered I/O should be used. Unexpected results may occur in processes that do buffering in the user address space. The process may later read/write data which is/was locked. Functions that operate on standard I/O streams are the most common source of such buffering.
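
As a rough illustration of the advice above, the following C sketch performs the update with write() while a lockf() lock is held, so no data lingers in a user-space buffer after the lock is released. The function names and the /tmp path are assumptions made for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes at offset 0 of fd while holding a
 * lockf() lock on that section. Returns 0 on success, -1 on failure. */
int locked_write(int fd, const char *buf, size_t len)
{
    assert(buf != NULL && len > 0);   /* a size of 0 would lock to end of file */

    if (lseek(fd, 0, SEEK_SET) == -1 || lockf(fd, F_LOCK, (off_t)len) == -1)
        return -1;

    /* write() does no user-space buffering, so the data is handed to the
     * system before the lock is released below. */
    int ok = write(fd, buf, len) == (ssize_t)len;

    /* lockf() sections start at the current offset, so rewind first. */
    if (lseek(fd, 0, SEEK_SET) == -1 || lockf(fd, F_ULOCK, (off_t)len) == -1)
        ok = 0;
    return ok ? 0 : -1;
}

/* Demo on a temporary file: writes under the lock, then reads back. */
int locked_write_demo(void)
{
    char path[] = "/tmp/lockf_demo_XXXXXX";   /* assumed writable directory */
    char back[5];
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    int rc = locked_write(fd, "hello", 5);
    if (rc == 0 && (pread(fd, back, 5, 0) != 5 || memcmp(back, "hello", 5) != 0))
        rc = -1;
    close(fd);
    return rc;
}
```

Had fwrite() on a buffered stream been used instead, the bytes could remain in the stream buffer and reach the file only after the lock had been released.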

There are no notes attached to this issue.




Viewing Issue Advanced Details
1673 [Issue 8 drafts] Rationale Comment Error 2023-04-18 15:01 2023-07-03 10:52
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
B.3.1
3803-3806
131760-131838
Rationale about removed interfaces needs updating
XRAT B.3.1 lists the interfaces removed in the revision and gives advice about what to use instead. Updating it is not a purely editorial matter since the advice given needs to be reviewed.

In the proposed changes I did not think it worth listing the functions that were in the STREAMS and Tracing options individually.
Replace the entire contents of B.3.1 (and its subsections) with:
This section contains a list of options and interfaces removed in POSIX.1-202x, together with advice for application developers on the alternative interfaces that should be used.
B.3.1.1 STREAMS Option
Applications are recommended to use UNIX domain sockets as an alternative for much of the functionality provided by this option. For example, file descriptor passing can be performed using sendmsg() and recvmsg() with SCM_RIGHTS on a UNIX domain socket instead of using ioctl() with I_SENDFD and I_RECVFD on a STREAM.
B.3.1.2 Tracing Option
Applications are recommended to use implementation-provided extension interfaces instead of the functionality provided by this option. (Such interfaces were in widespread use before the Tracing option was added to POSIX.1 and continued to be used in preference to the Tracing option interfaces.)
B.3.1.3 _longjmp() and _setjmp()
Applications are recommended to use siglongjmp() and sigsetjmp() instead of these functions.
B.3.1.4 _tolower() and _toupper()
Applications are recommended to use tolower() and toupper() instead of these functions.
B.3.1.5 ftw()
Applications are recommended to use nftw() instead of this function.
B.3.1.6 getitimer() and setitimer()
Applications are recommended to use timer_gettime() and timer_settime() instead of these functions.
B.3.1.7 gets()
Applications are recommended to use fgets() instead of this function.
B.3.1.8 gettimeofday()
Applications are recommended to use clock_gettime() instead of this function.
B.3.1.9 isascii() and toascii()
Applications are recommended to use macros equivalent to the following instead of these functions:
#define isascii(c) (((c) & ~0177) == 0)
#define toascii(c) ((c) & 0177)
An alternative replacement for isascii(), depending on the intended outcome if the code is ported to implementations with different character encodings, might be:
#define isascii(c) (isprint((c)) || iscntrl((c)))
(In the C or POSIX locale, this determines whether c is a character in the portable character set.)
B.3.1.10 pthread_getconcurrency() and pthread_setconcurrency()
Applications are recommended to use thread scheduling (on implementations that support the Thread Execution Scheduling option) instead of these functions; see [xref to XSH 2.9.4 Thread Scheduling].
B.3.1.11 rand_r()
Applications are recommended to use nrand48() or random() instead of this function.
B.3.1.12 setpgrp()
Applications are recommended to use setpgid() or setsid() instead of this function.
B.3.1.13 sighold(), sigpause(), and sigrelse()
Applications are recommended to use pthread_sigmask() or sigprocmask() instead of these functions.
B.3.1.14 sigignore(), siginterrupt(), and sigset()
Applications are recommended to use sigaction() instead of these functions.
B.3.1.15 tempnam()
Applications are recommended to use mkdtemp(), mkstemp(), or tmpfile() instead of this function.
B.3.1.16 ulimit()
Applications are recommended to use getrlimit() or setrlimit() instead of this function.
B.3.1.17 utime()
Applications are recommended to use futimens() if a file descriptor for the file is open, otherwise utimensat(), instead of this function.
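
As an illustration of the B.3.1.1 advice, file descriptor passing with sendmsg() and recvmsg() using SCM_RIGHTS might look like the C sketch below. The function names send_fd and recv_fd are hypothetical, and the availability of CMSG_SPACE() and CMSG_LEN() is assumed (they are widely implemented, but were not specified by earlier issues of the standard).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Sends fd_to_send over the UNIX domain socket sock.
 * Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd_to_send)
{
    char payload = 'F';   /* one ordinary data byte accompanies the control data */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        struct cmsghdr align;                 /* ensures suitable alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;

    assert(fd_to_send >= 0);
    memset(&control, 0, sizeof control);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives one file descriptor from sock.
 * Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

A typical use would create a connected pair with socketpair(AF_UNIX, SOCK_STREAM, 0, sv), call send_fd(sv[0], fd) in one process, and call recv_fd(sv[1]) in the other; the received descriptor refers to the same open file description as the one sent.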

There are no notes attached to this issue.




Viewing Issue Advanced Details
1674 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Omission 2023-04-19 02:35 2023-09-05 11:09
eblake
 
normal  
Applied  
Accepted As Marked  
   
Eric Blake
Red Hat
ebb.posix_spawnp
XSH posix_spawnp
1455
48328
Approved
Note: 0006411
may posix_spawnp() fail with ENOEXEC?
I'm raising this based on a thread on the Cygwin mailing list: https://cygwin.com/pipermail/cygwin/2023-April/253495.html

The standard is clear that while execl(), execle(), execv(), execve(), and fexecve() can fail with ENOEXEC, execlp() and execvp() must instead fall back to an invocation of 'sh' (see page 784 line 26548, and page 788 line 26744). However, when it comes to posix_spawn() and posix_spawnp(), the standard is silent as to whether a fallback to 'sh' must be attempted.

At first glance, one might assume that similarity in naming to the exec counterparts means that posix_spawn() matches execv() (no PATH search attempted, no sh fallback), and posix_spawnp() matches execvp() (utilize PATH search, attempt sh fallback). However, behavior differs between platforms (some fall back to sh for both spawn variants, Cygwin only for posix_spawnp, and glibc has refused the fallback for any file not starting with #! since 2.15 on Linux (patched in 2011, https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d96de963), and since 2.33 on Hurd (patched in 2020, https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=13adfa34); musl also does not fall back on either function, but it also doesn't fall back to sh on execvp(), so it is not a factor here).

But a more precise read shows that the current wording does not actually specify what should happen on ENOEXEC: page 1452 line 48229 describes the PATH search aspect of posix_spawnp(), but has no mention of a fallback to sh; and page 1455 line 48329 states that posix_spawn[p]() may fail for the same reasons as one of the exec family without specifying any particular mapping between the 2 spawn functions and the 7 exec functions. That is, I could come up with a weirdnix where posix_spawn() uses execvp() with a filename containing '/' (no PATH search but does get a fallback to sh), while posix_spawnp() implements its own PATH search to open an fd and use fexecve() which can see ENOEXEC, and that would still be compliant given the current ambiguous wording.

More to the point, the non-normative XRAT B.3.3 starting on page 3694 line 126541 gives a sample implementation of posix_spawn() (but not posix_spawnp()) that uses execve() (line 126790) and therefore can fail with ENOEXEC in the child process but has exit status 127 (confusing, since everywhere else in the standard we try to turn ENOEXEC into exit status 126 - but consistent with the current requirements on posix_spawn). Furthermore, the standard is clear that posix_spawn[p] can utilize a different operating system hook than the exec family, where it may be impractical to even attempt a fallback to sh (although the standard shows an implementation using fork/exec where the child process could still attempt a second execve() after the ENOEXEC failure, the intent was that there may be some other system-specific mechanism for creating a child process that does NOT have any way of doing a second attempt short of reapplying the full set of file actions in the parent process).

Additionally, on systems with setuid binaries, application writers PREFER to use posix_spawn*() over exec*() to have more control over starting a child process with known characteristics. If a setuid binary can end up trying to invoke a child process on something that would give an ENOEXEC error, and blindly attempts to hand that to 'sh', but it is not a valid shell script, this could easily be exploited as a security hole when sh starts executing random data as a shell script (which is the rationale glibc gave for disabling the fallback to sh for non-#! files on both spawn variants). Of course, this standard doesn't specify #! behavior; that's implementation-defined (and on Linux, starting a file with #! is handled by the kernel as a known executable format that can't fail with ENOEXEC, so only the files without #! can even reach the point where glibc attempts an sh fallback when it wants to in execvp).

Although the XRAT is non-normative, I think we are better off mandating that posix_spawn and posix_spawnp do NOT attempt an sh fallback on ENOEXEC failure; but the group may decide that is at too much risk of breaking some existing implementations that depend on an sh fallback, and decide to water this down into instead declaring the sh fallback to be implementation-defined (where glibc would then define that there is no sh fallback).
At page 1452 line 48234 (XSH posix_spawn DESCRIPTION), add a sentence:
However, if at least one of the exec family of functions would fail with [ENOEXEC] because the process image contents are not executable, this shall cause posix_spawnp( ) to fail rather than attempting a fallback to invoking the process as a shell script passed to sh.


At page 1455 line 48328 (XSH posix_spawn ERRORS), change:
If posix_spawn( ) or posix_spawnp( ) fail for any of the reasons that would cause fork( ) or one of the exec family of functions to fail, an error value shall be returned as described by fork( ) and exec, respectively (or, if the error occurs after the calling process successfully returns, the child process shall exit with exit status 127).
to:
If posix_spawn( ) or posix_spawnp( ) fail for any of the reasons that would cause fork( ) or one of the exec family of functions to fail, including when the corresponding exec function would attempt a fallback to sh instead of failing with [ENOEXEC], an error value shall be returned as described by fork( ) and exec, respectively (or, if the error occurs after the calling process successfully returns, the child process shall exit with exit status 127).
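
To see which of the behaviors discussed above a given implementation exhibits, a probe along the following lines could be used. It is only a sketch: the function name probe_spawnp_enoexec and the /tmp path are assumptions, and the outcome is deliberately reported rather than assumed, since it differs between platforms.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <spawn.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical probe: creates an executable text file with no "#!" line,
 * passes it to posix_spawnp(), and reports what happened.
 *   returns > 0 : posix_spawnp() itself failed with that error number
 *   returns 0   : a child was created; *exit_status holds its exit status
 *                 (127 suggests the exec step failed in the child, 0
 *                 suggests the file was run, for example through sh)
 *   returns -1  : the probe could not be set up or the child was killed */
int probe_spawnp_enoexec(int *exit_status)
{
    char path[] = "/tmp/spawn_probe_XXXXXX";   /* assumed writable, exec-enabled */
    static const char body[] = "exit 0\n";
    const ssize_t body_len = (ssize_t)(sizeof body - 1);

    assert(exit_status != NULL);

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    if (write(fd, body, sizeof body - 1) != body_len || fchmod(fd, S_IRWXU) == -1) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);   /* close before spawning so the file is not open for writing */

    /* The path contains a slash, so no PATH search takes place. */
    char *argv[] = { path, NULL };
    pid_t pid;
    int rc = posix_spawnp(&pid, path, NULL, NULL, argv, environ);
    if (rc == 0) {
        int status;
        if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
            rc = -1;
        else
            *exit_status = WEXITSTATUS(status);
    }

    unlink(path);
    return rc;
}
```

Under the change proposed above, a conforming implementation would be expected either to return an error such as ENOEXEC from posix_spawnp() or to produce a child that exits with status 127, but not to run the file through sh.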
Notes
(0006264)
eblake   
2023-04-19 12:36   
While we're touching this, it may be worth acknowledging that the XRAT example is not robust (among other things, it mishandles file names containing things like '*', and can easily overwrite past the bounds of an array when passed a long filename).

At page 3695 line 126556 (XRAT B.3.3), change:
The effective behavior of a successful invocation of posix_spawn( ) is as if the operation were implemented with POSIX operations as follows:
to:
The example below demonstrates an initial approach to implementing posix_spawn( ) using other POSIX operations, although an actual implementation will need to be more robust at handling all possible file names.


(0006411)
geoffclare   
2023-07-31 16:06   
Interpretation response
------------------------

The standard does not speak to this issue, and as such no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
None.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in the Desired Action, and also the changes in Note: 0006264.




Viewing Issue Advanced Details
1675 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-04-20 10:57 2023-07-03 10:54
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
2.11 Job Control
2504
81683-81684
Copy and paste error in new job control section
In the resolution of bug 0001254 there was a copy and paste error. In draft 3 the lines in question are at 81683-81684. These lines resulted from copying and pasting earlier lines (81666-81667 in draft 3) and updating them, but the update omitted a needed change.

(Credit to kre for spotting this.)
Change:
the message shall be written immediately after the job becomes suspended
to:
the message shall be written immediately after the job completes or is terminated

There are no notes attached to this issue.




Viewing Issue Advanced Details
1676 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-04-21 09:30 2023-07-03 10:55
gbrandenrobinson
 
normal  
Applied  
Accepted As Marked  
   
G. Branden Robinson
vi
3536
120971-120973
Note: 0006271
spurious use of boldface
The part of the sentence after "For example," is in boldface for no reason obvious to me.
Remove the bold markup tags from the sentence.
Notes
(0006271)
geoffclare   
2023-04-24 08:41   
In Issue 6 just the initial 3 was bold; everything from <control-F> on was not. This changed in Issue 7 when we switched to groff and the cause is a difference between historical troff and groff. The source has:

.B 3\c

With groff the bold is retained for the lines following this, whereas with historical troff it was not.

Changing the line to:

.B 3 \c

will fix it.




Viewing Issue Advanced Details
1677 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-04-21 09:38 2023-07-03 10:57
gbrandenrobinson
 
normal  
Applied  
Accepted As Marked  
   
G. Branden Robinson
xgettext
3575
122507-122508
Note: 0006322
"Future Directions" text outdated
The future direction about the "-n" option being described with "shall" has already come to pass; see lines 122365-122366.
Replace this section of the man page with "None.".
Notes
(0006322)
geoffclare   
2023-06-12 16:11   
(edited on: 2023-06-12 16:13)
Change on L122507-122508:
A future version of this standard may change the description of the −n option to use ``shall’’ instead of ``should’’.
to:
A future version of this standard may change the description of the −n option to mandate the given comment format (by using ``shall’’ instead of ``should’’).






Viewing Issue Advanced Details
1678 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-04-21 09:44 2023-07-03 10:59
gbrandenrobinson
 
normal  
Applied  
Accepted As Marked  
   
G. Branden Robinson
timeout
3412
116428-116430
Note: 0006323
use correct word: "descendant"
Online authorities seem to conflict over the correct adjectival form of this word, but in usage as a noun, as here, they are more strongly aligned: the standard should speak of descendants, not *descendents.
Correct the spelling error.
Notes
(0006323)
geoffclare   
2023-06-12 16:17   
On page 3412-3416 line 116456-116636 section timeout, change all occurrences of "descendent" to "descendant".




Viewing Issue Advanced Details
1679 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-04-21 10:00 2023-07-03 11:00
gbrandenrobinson
 
normal  
Applied  
Accepted As Marked  
   
G. Branden Robinson
msgfmt
3166
106981
See Note: 0006324.
strictly increasing order vs. monotonic
"the application shall ensure that the statement containing the msgid directive is immediately followed by a msgid_plural directive and that each statement containing a msgid_plural directive is followed by count statements containing msgstr[index] directives, starting with msgstr[0] and ending with msgstr[count−1] in monotonically increasing order."

Shouldn't the requirement on the application be that it shall use a _strictly_ increasing order?

If not, and if "monotonically" is truly meant, should something be said about which statements with duplicate indices shall prevail?

My understanding is that in computer science applications, we can generally read "monotonically increasing" as a synonym for "nondecreasing". But often what we mean is "strictly increasing".

Unfortunately I lack the training to venture an opinion on whether, say, the Weierstrass function W(x) is monotonically increasing in the neighborhood of x. But I think I know enough to say that I'm sure I'd get into trouble before properly studying real analysis.
Clarify for non-mathematicians, and those who gaze upon credentialed mathematicians with envy.
Notes
(0006324)
Don Cragun   
2023-06-12 16:34   
(edited on: 2023-06-12 16:41)
On L106981, Page 3166
Change:
in monotonically increasing order.
to:
in increasing order, with no duplicate index values.
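
The difference between the two readings can be made concrete with a small C sketch. The function names are hypothetical and the code is not taken from any msgfmt implementation; it only contrasts a nondecreasing check with the stricter requirement that the indices run from 0 to count-1 with no duplicates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if no element is smaller than its predecessor. This is the loose
 * reading of "monotonically increasing" and it tolerates duplicates. */
bool is_nondecreasing(const int *idx, size_t n)
{
    assert(idx != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        if (idx[i] < idx[i - 1])
            return false;
    return true;
}

/* True if the indices are exactly 0, 1, ..., n-1, that is, they start at
 * msgstr[0], end at msgstr[count-1], and increase with no duplicates. */
bool is_valid_msgstr_sequence(const int *idx, size_t n)
{
    assert(idx != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (idx[i] < 0 || (size_t)idx[i] != i)
            return false;
    return true;
}
```

For the index sequence 0, 1, 1, 2 the first check passes while the second fails, which is the case the revised wording is meant to exclude.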






Viewing Issue Advanced Details
1681 [Issue 8 drafts] Base Definitions and Headers Comment Omission 2023-04-23 15:28 2023-07-06 10:15
dennisw
 
normal  
Applied  
Accepted  
   
Dennis Wölfing
devctl.h
234
8262
<devctl.h> should define size_t
The posix_devctl function takes a parameter of type size_t, but the <devctl.h> header is not required to define that type.
On page 234 after line 8262 add:
The <devctl.h> header shall define the size_t type as described in <sys/types.h>.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1682 [Issue 8 drafts] Base Definitions and Headers Comment Omission 2023-04-23 15:30 2023-07-06 10:16
dennisw
 
normal  
Applied  
Accepted  
   
Dennis Wölfing
libintl.h
280
9709
<libintl.h> should define locale_t
The *_l functions in <libintl.h> each take a parameter of type locale_t, but the <libintl.h> header is not required to define that type.
On page 280 after line 9709 add:
The <libintl.h> header shall define the locale_t type as described in <locale.h>.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1683 [Issue 8 drafts] System Interfaces Comment Error 2023-04-23 15:33 2023-07-06 10:18
dennisw
 
normal  
Applied  
Accepted  
   
Dennis Wölfing
posix_devctl
1548
51937
nbyte cannot be negative in posix_devctl
One of the error cases for EINVAL is that nbyte is negative. However nbyte is of type size_t which is unsigned.
On page 1548 line 51937 change
The nbyte argument is negative, or exceeds ...
to
The nbyte argument exceeds ...
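
The reasoning can be illustrated with a few lines of C. The helper names and the max_nbyte parameter are hypothetical; the point is only that a negative value converted to size_t becomes a very large positive one, so an upper-bound check is the only meaningful test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical validation helper: nbyte has type size_t, so only an upper
 * bound can be tested. A comparison such as (nbyte < 0) would always be
 * false for an unsigned type. */
int nbyte_is_valid(size_t nbyte, size_t max_nbyte)
{
    return nbyte <= max_nbyte;
}

/* Shows what a caller's negative value turns into when it is converted
 * to size_t at the call site. */
size_t as_size_t(long value)
{
    return (size_t)value;
}
```

For example, as_size_t(-1) yields SIZE_MAX, which nbyte_is_valid() rejects for any realistic limit; no separate "negative" case is needed.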
There are no notes attached to this issue.




Viewing Issue Advanced Details
1684 [Issue 8 drafts] Rationale Comment Error 2023-04-24 14:42 2023-07-06 10:24
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
E.1 Subprofiling Option Groups
3907-3913
135750-136016
Various fixes needed to subprofiling groups
Changes to the subprofiling groups have sometimes been overlooked when changes affecting them have been made to the normative text.

Note that interfaces that are part of "POSIX options" (i.e. options other than XSI) or part of XSI option groups are not included, so posix_devctl() was correctly not added because it is part of the DC option.

The following functions and function-like macros are missing from the subprofiling groups:

CMPLX()
CMPLXF()
CMPLXL()
_Fork()
asprintf()
be16toh()
be32toh()
be64toh()
futimens()
htobe16()
htobe32()
htobe64()
htole16()
htole32()
htole64()
le16toh()
le32toh()
le64toh()
nl_langinfo_l()
open_wmemstream()
posix_close()
pthread_barrierattr_destroy()
pthread_barrierattr_getpshared()
pthread_barrierattr_init()
pthread_barrierattr_setpshared()
sched_yield()
strerror_l()
strftime_l()
towlower_l()
towupper_l()
utimes()
vasprintf()

The following is in the POSIX_BARRIERS group but does not exist:

pthread_barrierattr()

(It may perhaps have been intended to be pthread_barrierattr_*(), but there are no other uses of wildcards in the subprofiling groups, so this should be replaced with the appropriate functions from the "missing" list above.)

The following should not be in the subprofiling groups because they are part of a POSIX option:

pthread_attr_getstack()
pthread_attr_setstack()

The following are in XSI subprofiling groups but should be in POSIX groups:

getrlimit()
setrlimit()

In addition, there is an inconsistency in how the "pw" and "gr" functions are split between XSI_SYSTEM_DATABASE and XSI_USER_GROUPS: endpwent(), getpwent(), and setpwent() are in XSI_SYSTEM_DATABASE but endgrent(), getgrent(), and setgrent() are in XSI_USER_GROUPS. The latter should be moved to XSI_SYSTEM_DATABASE (rather than the other way round) because getgrgid() and getgrnam() are in POSIX_SYSTEM_DATABASE.
On page 3907 line 135754 section E.1 (POSIX_BARRIERS), change:
pthread_barrierattr()
to:
pthread_barrierattr_destroy(), pthread_barrierattr_getpshared(), pthread_barrierattr_init(), pthread_barrierattr_setpshared()

After page 3907 line 135766 add CMPLX(), CMPLXF(), and CMPLXL() to POSIX_C_LANG_MATH.

After page 3909 line 135834 add posix_close() to POSIX_DEVICE_IO.

After page 3909 line 135840 add asprintf() and vasprintf() to POSIX_DEVICE_IO_EXT.

After page 3909 line 135863 add futimens() to POSIX_FILE_SYSTEM.

After page 3910 line 135895 add _Fork() to POSIX_MULTI_PROCESS.

After page 3910 line 135901 add be16toh(), be32toh(), be64toh(), htobe16(), htobe32(), htobe64(), htole16(), htole32(), htole64(), le16toh(), le32toh(), and le64toh() to POSIX_NETWORKING.

After page 3910 line 135876 add nl_langinfo_l() to POSIX_I18N.

After page 3910 line 135887 add strerror_l(), strftime_l(), towlower_l(), and towupper_l() to POSIX_MULTI_CONCURRENT_LOCALES.

After page 3910 line 135895, add getrlimit() and setrlimit() to POSIX_MULTI_PROCESS.

After page 3912 line 135975 add open_wmemstream() to POSIX_WIDE_CHAR_DEVICE_IO.

After page 3911 line 135950 add sched_yield() to POSIX_THREADS_BASE.

After page 3912 line 135991 add utimes() to XSI_FILE_SYSTEM.

On page 3912 line 136001 delete getrlimit() and setrlimit() from XSI_MULTI_PROCESS.

After page 3913 line 136006 add endgrent(), getgrent(), and setgrent() to XSI_SYSTEM_DATABASE.

On page 3913 line 136010 delete:
XSI_THREADS_EXT: XSI Threads Extensions
pthread_attr_getstack(), pthread_attr_setstack()

On page 3913 line 136013-136014 delete endgrent(), getgrent(), and setgrent() from XSI_USER_GROUPS.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1685 [Issue 8 drafts] Front Matter Objection Error 2023-04-25 09:38 2023-07-06 10:31
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
Referenced Documents
xxxvi-xxxvii
N/A
RFC references need updating
Some of the RFCs we reference have been superseded. We should change to refer to the new ones.

We list RFCs 791, 2292, and 2460 in the Referenced Documents section, but there are no references to them. I have found suitable places to reference 791 and 2460 (actually 8200 which obsoletes 2460), but I don't see the need for a reference to 2292 since it concerns the use of raw sockets over IPv6 and XSH 2.10.6 says of raw socket datagrams that "the formats are protocol-specific and implementation-defined." So application writers need to refer to an implementation's documentation to see how to use raw sockets anyway (which might well say it supports RFC 2292, but on the other hand it might not). Without an actual reference to it, 2292 should be deleted from the list (otherwise the introductory text "The following documents are referenced in POSIX.1-202x" will be untrue).

We have also added references to RFC 6557 within the text but not added it to the Referenced Documents section.
On page xxxvi para 6, delete:
IETF RFC 822
Standard for the Format of ARPA Internet Text Messages, D.H. Crocker, August 1982 (available at: www.ietf.org/rfc/rfc0822.txt).
and delete the "Notes to Reviewers" below it.

On page xxxvi last para, delete:
IETF RFC 1886
DNS Extensions to Support Internet Protocol, Version 6 (IPv6), C. Huitema, S. Thomson, December 1995 (available at: www.ietf.org/rfc/rfc1886.txt).

On page xxxvii para 5-7, delete:
IETF RFC 2292
Advanced Sockets API for IPv6, W. Stevens, M. Thomas, February 1998 (available at: www.ietf.org/rfc/rfc2292.txt).
IETF RFC 2373
Internet Protocol, Version 6 (IPv6) Addressing Architecture, S. Deering, R. Hinden, July 1998 (available at: www.ietf.org/rfc/rfc2373.txt).
IETF RFC 2460
Internet Protocol, Version 6 (IPv6), S. Deering, R. Hinden, December 1998 (available at: www.ietf.org/rfc/rfc2460.txt).

On page xxxvii para 8, insert:
IETF RFC 3596
DNS Extensions to Support IP Version 6, S. Thomson, C. Huitema, V. Ksinant, M. Souissi, October 2003 (available at: www.ietf.org/rfc/rfc3596.txt).
IETF RFC 4291
IP Version 6 Addressing Architecture, R. Hinden, S. Deering, February 2006 (available at: www.ietf.org/rfc/rfc4291.txt).
IETF RFC 5322
Internet Message Format, P. Resnick, October 2008 (available at: www.ietf.org/rfc/rfc5322.txt).
IETF RFC 6557
Procedures for Maintaining the Time Zone Database, E. Lear, P. Eggert, February 2012 (available at: www.ietf.org/rfc/rfc6557.txt).
IETF RFC 8200
Internet Protocol, Version 6 (IPv6) Specification, S. Deering, R. Hinden, July 2017 (available at: www.ietf.org/rfc/rfc8200.txt).
before:
Internationalisation Guide

On page 555 line 19952 section 2.10.19, after:
Support for sockets over Internet protocols based on IPv4 is mandatory.
add:
IPv4 is described in RFC 791.

After page 555 line 19966 section 2.10.20, add a new paragraph:
IPv6 is described in RFC 8200.

On page 555 line 19971 section 2.10.20.1, and
page 1235 line 42213 section inet_ntop(), change:
RFC 2373
to:
RFC 4291

On page 840 line 28773 section endhostent(), and
page 1018 line 34929 section freeaddrinfo(), change:
RFC 1886
to:
RFC 3596

On page 3099 line 104305 section mailx, change:
RFC 5322 (which succeeded RFC 822)
to:
RFC 5322

There are no notes attached to this issue.




Viewing Issue Advanced Details
1686 [Issue 8 drafts] System Interfaces Objection Error 2023-04-25 15:12 2023-07-06 10:34
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
strtod(), wcstod()
2159, 2383
77264, 70631
Missing CX shading for underflow ERANGE in strtod() and wcstod()
The strtod() page correctly has CX shading in the ERRORS section on the requirement for errno to be set to ERANGE when the return value would underflow. (C17 says it is implementation-defined whether errno is set to ERANGE.)

However, the RETURN VALUE section also has a statement about this and it is missing the CX shading.

The wcstod() page is worse: it is missing the CX shading in both places.

A related inconsistency between strtod() and wcstod() is that for overflow, strtod() has a condition on the rounding mode whereas wcstod() does not. C17 states the same condition for both.
On page 2159 line 70631 section strtod(), and
page 2383 line 77264 section wcstod(), change:
... shall be returned and errno set to [ERANGE].
to:
... shall be returned [CX]and errno set to [ERANGE][/CX].

On page 2383 line 77260 section wcstod(), change:
If the correct value is outside the range of representable values
to:
If the correct value would cause an overflow and default rounding is in effect

On page 2383 line 77268 section wcstod(), change:
The value to be returned would cause overflow or underflow.
to:
The value to be returned would cause overflow and default rounding is in effect [CX]or the value to be returned would cause underflow[/CX].
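
A portable caller therefore has to treat the two range errors differently. The following C sketch shows one way to observe them; the wrapper name parse_double is an assumption, and the underflow case is only loosely checked because C17 leaves the errno setting implementation-defined there.

```c
#include <assert.h>
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical wrapper: converts text with strtod() and stores the errno
 * value left behind by the call (0 if errno was not set) in *err. */
double parse_double(const char *text, int *err)
{
    assert(text != NULL && err != NULL);
    errno = 0;                       /* strtod() does not clear errno itself */
    double value = strtod(text, NULL);
    *err = errno;
    return value;
}
```

With default rounding in effect, parse_double("1e400", &e) would be expected to return HUGE_VAL with e set to ERANGE. For parse_double("1e-400", &e) the result is no greater than DBL_MIN, and e is ERANGE only on implementations that follow the CX-shaded requirement.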

There are no notes attached to this issue.




Viewing Issue Advanced Details
1687 [Issue 8 drafts] Base Definitions and Headers Objection Clarification Requested 2023-05-03 10:30 2023-07-06 10:37
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
5 File Format Notation
115
3632
Mismatch between blanks in file formats and default IFS
In the resolution of bug 0001532 the stty example code for restoring the terminal size uses awk because the stty utility is allowed to include blanks from the current locale (not just blank from the portable character set) around the numeric fields. It was mentioned in Note: 0005661 that if we change XBD chapter 5 so that it allows implementations to add blanks only from the portable character set, then the example code could be changed back to using shell field splitting instead of using awk.

I believe the reason for allowing blanks around numeric fields was because of implementation differences noted during work on the original POSIX.2 drafts, such as the output of "wc -l" including leading blanks on some systems but not others. These differences at the time would only have involved space and tab characters, no other blanks, so there was no reason to allow other blanks in the original POSIX.2-1992 standard.

There are likely a large number of applications that expect to be able to use the default IFS to do field splitting on such output, and at the moment this is not guaranteed to work (in locales other than C and POSIX). We should change that.

For consistency, the other use of <blank> in XBD chapter 5 should change to match.
On page 113 line 3543 section 5, change:
' ' (An empty character position.) Represents one or more <blank> characters.
to:
' ' (An empty character position.) Represents one or more <blank> characters from the portable character set.

On page 115 line 3632 section 5, change:
with <blank> characters
to:
with <blank> characters from the portable character set

On page 3379 line 115321 section stty, change:
stty size | awk '{printf "stty rows %d cols %d", $1, $2}'
to:
printf "stty rows %d cols %d" $(stty size)
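
The field splitting that the revised stty example relies on can be sketched as follows. This is only an illustration: stty size needs a terminal, so the sketch substitutes a hypothetical function, fake_stty_size, whose padded output is an assumption chosen to resemble what an implementation might write. It also assumes IFS still has its default value.

```sh
# fake_stty_size is a hypothetical stand-in for "stty size"; the leading
# and repeated blanks are assumed, to mimic padding around numeric fields.
fake_stty_size() {
    printf '   24 \t 80\n'
}

# The unquoted command substitution is split on the default IFS
# (<space>, <tab>, <newline>), so the padding is discarded and printf
# receives exactly two numeric arguments.
restore_cmd=$(printf 'stty rows %d cols %d' $(fake_stty_size))
printf '%s\n' "$restore_cmd"    # prints: stty rows 24 cols 80
```

If the padding included a blank outside the portable character set, the default IFS would not split on it, which is the case the proposed wording rules out.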

There are no notes attached to this issue.




Viewing Issue Advanced Details
1688 [Issue 8 drafts] Base Definitions and Headers Editorial Omission 2023-05-03 14:13 2023-07-06 10:39
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
1.1 Scope
3-4
37-40
Scope is missing the "Additional APIs" documents and ISO TR24731-2
XBD 1.1 Scope lists the base documents from which the facilities provided in the standard are drawn. Currently only POSIX.1-2017, POSIX.26-2003, and C17 are listed. The "Additional APIs" documents containing the new interfaces sponsored by The Open Group, and ISO TR24731-2 (from which [v]asprintf() were drawn) are missing.
After page 4 line 40 add three items to the bullet list:
  • ISO/IEC TR 24731-2:2010, Programming languages, their environments and system software interfaces -- Extensions to the C library -- Part 2: Dynamic Allocation Functions

  • The Open Group Standard, 2021, Additional APIs for the Base Specifications Issue 8, Part 1

  • The Open Group Standard, 2022, Additional APIs for the Base Specifications Issue 8, Part 2

There are no notes attached to this issue.




Viewing Issue Advanced Details
1689 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 10:58 2023-07-06 10:41
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3103
104470
Maybe remove 'the user shall'
Line 104470 says:
> The user shall ensure that a portable makefile shall:

As a reader, I don't understand why 'the user shall' do something in this situation instead of leaving it unspecified.

What would effectively change if the words 'the user shall ensure that' were simply removed?
Remove the words 'the user shall ensure that', with no intended change in meaning.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1690 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 11:05 2023-07-06 10:42
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3103
104486, 104540
Note: 0006336
Use consistent wording for make's options
Line 104486 says:
> This mode is the same

Line 104540 says:
> This mode shall be the same

Since these two requirements seem to be structurally the same, they should use the same wording.
Reword one of the lines to match the other.
Notes
(0006336)
geoffclare   
2023-06-15 16:05   
On line 104486 change:
This mode is the same
to:
This mode shall be the same




Viewing Issue Advanced Details
1692 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 11:21 2023-07-06 10:44
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3103
104519, 104529, 104542
Note: 0006337
Options -n, -q, -t use different wording
The option -n executes lines that use the macro MAKE, while the options -q and -t don't. Whether this difference in behavior is intended is unclear from reading the section.
Explicitly clarify that the options behave differently, either in the options themselves or in the rationale.

Or: Align the behavior of the options to each other.
Notes
(0006337)
geoffclare   
2023-06-15 16:23   
On page 3104 line 104532 section make (-q), and
page 3105 line 104548 section make (-t) change:
it is unspecified whether command lines that do not have a <plus-sign> prefix and are being processed ...
to:
it is unspecified whether command lines that do not have a <plus-sign> prefix and either expand the MAKE macro or are being processed ...




Viewing Issue Advanced Details
1693 [Issue 8 drafts] Shell and Utilities Objection Clarification Requested 2023-05-07 11:29 2023-07-10 10:36
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3105
104539, 104755
See Note: 0006338.
Align behavior of -s with existing practice
Line 104539 says:
> Do not write makefile command lines

Line 104755 says:
> An _execution line_ is built from the command line by removing any prefix characters.

Both GNU and BSD make write the _execution line_, not the _command line_, contradicting the specification. I didn't test other implementations but I guess the specification doesn't codify historical practice in this case.
In line 104539, change:
Do not write makefile command lines

to:
Do not write makefile execution lines


Add a cross reference to the term 'execution line'.
Notes
(0006338)
Don Cragun   
2023-06-19 15:19   
Change in make options on P3105, L104539:
    
Do not write makefile command lines
to:
Do not write makefile execution lines (see Makefile Execution on page xxx)




Viewing Issue Advanced Details
1694 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 11:34 2023-07-10 10:37
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3103, 3105
104484, 104567
Use consistent wording for 'in the order specified'
Line 104484 says:
> shall be processed in the order specified

Line 104567 says:
> shall be processed in the order they appear

If these sentences are intended to mean the same, they should use the same wording.
In line 104567, change:
in the order they appear

to:
in the order specified


With no change in meaning intended.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1696 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 12:16 2023-07-10 10:39
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3107
104629
See Note: 0006339.
Effect of -s when no work needs to be done
Line 104629 says:
> If make is invoked without any work needing to be done, it shall write a message to standard output indicating that no action was taken.

The sentence above the cited sentence explicitly specifies what happens when the -s option is given, but the cited sentence doesn't.

Does the standard intentionally omit to specify the effect of the -s option?

When no work needs to be done and make is run with the -s option, GNU make is silent while BSD make writes the message.
If intended, specify that the behavior of -s with no work to be done is unspecified.

If the currently specified behavior is intended, mention that -s has no effect in this case, to clearly distinguish it from the preceding sentence in the same paragraph.
Notes
(0006339)
Don Cragun   
2023-06-19 16:06   
(edited on: 2023-06-22 15:11)
Change in make STDOUT section on P3107, L104627-104633:
The make utility shall write all commands to be executed to standard output unless the −s option was specified, the command is prefixed with an at-sign, or the special target .SILENT has either the current target as a prerequisite or has no prerequisites. If make is invoked without any work needing to be done, it shall write a message to standard output indicating that no action was taken. If the −t option is present and a file is touched, make shall write to standard output a message of unspecified format indicating that the file was touched, including the filename of the file.
to:
If make is invoked without any work needing to be done, it may write a message to standard output indicating that no action was taken. Otherwise, the make utility shall write all commands to be executed (and the filenames of files touched for the -t option in a message of unspecified format) to standard output unless the −s option was specified, the command is prefixed with an at-sign, or the special target .SILENT has either the current target as a prerequisite or has no prerequisites.






Viewing Issue Advanced Details
1697 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 12:24 2023-07-10 10:41
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3103-3136
several
See Note: 0006343.
Consider replacing out-of-date with not up-to-date
In the specification of the make utility, the term up-to-date occurs 45 times, the (supposedly opposite) term out-of-date occurs 12 times. To make the specification conceptually simpler, it would suffice to use only the term up-to-date consistently throughout the whole specification.
Reword the 12 places that use the term out-of-date to use the negated form of the corresponding up-to-date instead.

Alternatively, explicitly define that out-of-date means the exact opposite of up-to-date, and that there is no range of undefinedness between them.
Notes
(0006343)
Don Cragun   
2023-06-22 15:48   
In make rationale on P3131, L105687 after:
The HP-UX implementation of make treated it as out-of-date.
add a new sentence:
Note that up-to-date and out-of-date are antonyms.


In make future directions add a new paragraph after P3133, L105820.
A future version of this standard may require that a target with a prerequisite with an identical timestamp is considered out-of-date.




Viewing Issue Advanced Details
1698 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 12:27 2023-07-10 10:43
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3107
104651
Replace 'a target' with 'the target'
Line 104651 says:
> [...] a target [...] a target [...]

This text looks like a declaration of two independent targets, while the intended meaning is that the second 'a target' refers back to the first 'a target'.
In line 104651, replace:
of a target

with:
of the target
There are no notes attached to this issue.




Viewing Issue Advanced Details
1699 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 12:39 2023-07-10 10:46
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3108
104679
Note: 0006344
Use term 'rules' consistently
Line 104661 says:
> A makefile can contains rules, macro definitions [...]

Line 104679 says:
> The rules in makefiles shall consist of [...] macro definitions [...] and comments.

These two sentences contradict each other. A macro definition cannot be part of a rule.
In line 104679, replace:
The rules in makefiles

with:
Makefiles


The word 'rules' in line 104677 may need to be changed as well.
Notes
(0006344)
geoffclare   
2023-06-22 16:14   
On page 3107 line 104662 section make, change:
There are two kinds of rules: inference rules and target rules.
to:
There are two kinds of rules: target rules, including special targets (see Target Rules, on page 3110), and inference rules (see Inference Rules, on page 3116).

On page 3108 line 104677 section make, change:
The term makefile is used to refer to any rules provided by the user, whether in ./makefile or its variants, or specified by the −f option.
to:
The term makefile is used to refer to any makefile contents provided by the user, whether in ./makefile or its variants, or specified by the −f option.

(Note to the editor: this text will have moved if bug 1657 has been applied before this one.)

On page 3108 line 104679 section make, delete:
The rules in makefiles shall consist of the following types of lines: target rules, including special targets (see Target Rules, on page 3110), inference rules (see Inference Rules, on page 3116), macro definitions (see Macros, on page 3112), and comments.






Viewing Issue Advanced Details
1701 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 12:50 2023-07-10 10:47
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3108
104682
Fix spelling of 'Inference Rules'
No need for title case.
In line 104682, replace:
Target and Inference Rules

with:
Target and inference rules
There are no notes attached to this issue.




Viewing Issue Advanced Details
1702 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 12:51 2023-07-10 10:49
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3108
104687
Note: 0006345
Consistently use 'following' or 'next'
Line 104687 says:
> the following line

Line 104690 says:
> the next line

Since both lines mean the same, they should use the same wording.
Either always use 'following' or always use 'next'.
Notes
(0006345)
geoffclare   
2023-06-22 16:30   
Change "following line" to "next line".




Viewing Issue Advanced Details
1703 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 13:12 2023-07-18 10:36
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
ls, make
several
several
Replace at-sign with <commercial-at>
Lines 104749 and 104753 refer to characters in structurally the same way but are formatted differently:
> contains an at-sign
> contains a <plus-sign>

Since both refer to characters, they should both use angle brackets.
Section 6.1 'Portable Character Set' lists the official name for '@' as 'commercial-at'.
That name should be used in the few places where the '@' is used in utilities. (I only found it in ls and make.)
Replace 'at-sign' with the standard cross reference for characters, of the form:
<commercial-at> ('@')
There are no notes attached to this issue.




Viewing Issue Advanced Details
1704 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 13:15 2023-07-18 10:47
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3109
104753
Use simpler wording for '+' prefix character
Line 104753 says:
> If the command prefix contains a <plus-sign>, this indicates a makefile command line that shall be executed even if -n, -q or -t is specified.

Does the phrase 'indicates that' have any intended special meaning? The other prefix characters have clearer, more direct wording.
In line 104753, replace:
this indicates a makefile command line that

with:
the command line

Notes




Viewing Issue Advanced Details
1706 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 13:24 2023-07-18 10:50
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3110
104771
See Note: 0006352
Remove 'line that does not begin with <tab>' from the syntax for a rule
The syntax of a target rule does not include the following line, therefore that line should not be written in monospace font. That font wrongly suggests that the following line is part of the rule.
Replace line 104771 with regular text that describes that the target rule ends _before_ the unrelated line.
Notes
(0006352)
Don Cragun   
2023-06-26 15:27   
Delete line 104771.




Viewing Issue Advanced Details
1707 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 13:27 2023-07-18 10:51
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3110
104786
See Note: 0006353
Clarify meaning of 'use'
Line 104786 says:
> If the makefile uses this special target, [...]

Does this wording mean that if the makefile merely _defines_ a .DEFAULT target but doesn't actually use it, the application need not 'ensure that it is specified with commands, but without prerequisites'?
Reword line 104786 to avoid the above uncertainty.
Notes
(0006353)
nick   
2023-06-26 15:33   
(edited on: 2023-06-26 15:35)
Page 3110 line 104786 Replace

If the makefile uses this special target

with

If the makefile contains this special target





Viewing Issue Advanced Details
1708 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 13:30 2023-07-18 10:53
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3110
104792 and related
See Note: 0006354
Use consistent wording for special targets
Line 104791 says:
> Subsequent occurrences of .IGNORE shall add to the list of targets

Line 104798 says:
> Subsequent occurrences of .PHONY shall also apply these rules to the additional targets.

These two sentences seem to intend the same effect, therefore they should use the same wording.
Use the same wording for all special targets.
Notes
(0006354)
nick   
2023-06-26 15:43   
On page 3110 line 104799 replace

shall also apply these rules to the additional targets.

with

shall add to the list of phony targets.





Viewing Issue Advanced Details
1709 [Issue 8 drafts] Shell and Utilities Editorial Omission 2023-05-07 13:54 2023-07-18 10:55
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3112
104860
Note: 0006356
Specify handling of '#' in macro definitions
The section about macro definitions seems unorganized.

For example, the handling of '#' characters is only specified for the '?=' macro definition operator but not for the 5 other operators.

There should be a general introduction before defining the specific assignment operators. The handling of '#' characters and white-space characters around the macro name, the operator and the macro value should be covered in this general introduction.

     Macro definitions have the form 'name op value', where:

           name    is a single-word macro name,

           op      is one of the macro definition operators described
                   below, and

           value   is interpreted according to the macro definition operator.


As a reader, I also wonder why 'string1' is named so unspecifically instead of using 'name' for it. On the other hand, I understand that 'string2' is interpreted differently for each operator, thus the name 'value' wouldn't fit perfectly.
Reorganize the section 'Macros' to have the general parts first, followed by the operator-specific parts.
Notes
(0006356)
geoffclare   
2023-06-26 15:51   
Delete lines 104882-104885.




Viewing Issue Advanced Details
1710 [Issue 8 drafts] Shell and Utilities Editorial Omission 2023-05-07 14:00 2023-07-18 10:57
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3113
104930
Note: 0006364
Evaluate delayed-expansion macros in '!=' macro definition
Line 104930 lists the situations in which delayed-expansion macros are expanded. It is missing the right-hand side of a '!=' macro definition, as well as the right-hand side of a '+=' macro definition with an immediate-expansion macro on the left-hand side.
Add the missing cases explicitly, or come up with a concise definition for all situations in which delayed-expansion macros have to be expanded.
Notes
(0006364)
geoffclare   
2023-06-27 09:26   
Proposed new resolution ...

On page 3113 line 104928 change:
Delayed-expansion macros after the <equals-sign> in a macro definition shall not be evaluated until the defined macro is used in a rule or command, or before the <equals-sign> in a macro definition.
to:
Delayed-expansion macros after the <equals-sign> in macro definitions other than the :::=, !=, and += forms, and after the <equals-sign> in += form macro definitions where the macro named by string1 exists and is a delayed-expansion macro, shall only be evaluated when the defined macro is expanded.
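
The effect of delayed expansion for a plain "=" definition can be sketched as follows. This is a minimal illustration, not text from the standard: the macro names A and B, the target name, and the shell variable delayed_result are all hypothetical, and the sketch assumes a make utility is available on PATH (it reports that and stops otherwise).

```sh
# Assumes a make utility is on PATH; otherwise the sketch only records that.
if command -v make >/dev/null 2>&1; then
    # A is defined with "=" before B has a value. The right-hand side of A
    # is only evaluated when $(A) is expanded in the command line, by which
    # time B has been set. The makefile is fed to make on standard input.
    delayed_result=$(printf 'A = $(B)\nB = late\nall:\n\t@echo $(A)\n.PHONY: all\n' | make -f -)
else
    delayed_result='make not available'
fi
printf '%s\n' "$delayed_result"
```

With the ::=, :::=, or != forms the right-hand side would instead be processed when the definition is read, so the same ordering would not pick up the later value of B.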




Viewing Issue Advanced Details
1711 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 14:08 2023-07-18 10:59
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3113
104934
Note: 0006370
Reword 'as described above'
The specification of macro definitions contains the phrase 'as described above' several times. The wording is not perfectly clear about whether 'as described above' really only means 'above' or whether it actually means 'as described anywhere in this section'.

For example, a strict word-by-word interpretation of line 104934 would mean that in the macro definition
${NAME:%a=%b}=value

the '%' would not be interpreted as pattern matching placeholder, as the pattern matching is defined further below, starting in line 104943.
If intended, reword 'as described above' to 'as described in this section'.
Notes
(0006370)
geoffclare   
2023-06-29 15:17   
On page 3113 line 104934, and
page 3114 line 104941, change:
that inner macro expansion shall be performed as described above and the result substituted into string1
to:
that inner macro expansion shall be performed first and the result substituted into string1




Viewing Issue Advanced Details
1712 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 14:14 2023-07-18 11:00
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3114
104947
Note: 0006371
Specify case-sensitivity either everywhere or nowhere
Line 104947 says that the [op]%[os]=[np][%][ns] pattern matches in a case-sensitive manner.

The similar line 104936 does not mention case sensitivity, leaving the reader unsure about the case sensitivity there.
Remove the remark about case sensitivity, as it is more confusing than helpful.

Alternatively, mention case sensitivity issues everywhere else as well, for example in macro names, target names, pattern matching, word expansion.
Notes
(0006371)
geoffclare   
2023-06-29 15:30   
On page 3113 line 104947, change:
completely matches, in a case-sensitive manner, the
to:
completely matches the




Viewing Issue Advanced Details
1714 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 14:22 2023-07-18 11:02
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3114
104939, 104946
Note: 0006373
Use consistent wording for splitting a macro value into fields/words
Lines 104939 and 104946 try to say the same thing but use different words for it:

Line 104939 says:
> where a word, in this context, is defined to be a string delimited by the beginning of the line, a <blank>, or a <newline>

(This definition wrongly assumes that macro values are still related to the lines they originate from.)

Line 104946 says:
> each white-space-separated word
In line 104939, replace the complicated definition with the simpler 'each white-space-separated word', with no intended change in meaning.
Notes
(0006373)
geoffclare   
2023-06-29 15:49   
On page 3114 line 104939, change:
by the beginning of the line
to:
by the beginning of the value
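
How a macro value is split into white-space-separated words and matched against a pattern of the form [op]%[os] can be modelled in a few lines of shell. This is a rough sketch under stated assumptions, not an implementation taken from any make: the function name pattern_subst and its argument order are hypothetical, the replacement is assumed to contain a '%', and the script is assumed to run in a POSIX sh with the default IFS.

```sh
# Hypothetical model of $(string1:[op]%[os]=[np]%[ns]); not part of any make.
# Usage: pattern_subst VALUE OP OS NP NS
pattern_subst() {
    value=$1 op=$2 os=$3 np=$4 ns=$5
    out=
    set -f    # disable pathname expansion while the value is being split
    # Unquoted $value is split on the default IFS, which models
    # "each white-space-separated word" of the macro value.
    for word in $value; do
        case $word in
            "$op"*"$os")
                # The whole word matched: keep the stem that '%' stands for.
                stem=${word#"$op"}
                stem=${stem%"$os"}
                word=$np$stem$ns
                ;;
        esac
        out=${out:+$out }$word
    done
    set +f
    printf '%s\n' "$out"
}

pattern_subst 'main.c util.c README Main.C' '' '.c' '' '.o'
# prints: main.o util.o README Main.C
```

The shell's case patterns compare characters exactly, so Main.C is left alone. Words are delimited by the start and end of the value rather than by lines, which is the point of the wording change above.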




Viewing Issue Advanced Details
1715 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 14:26 2023-07-18 11:04
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3114
104964
Remove redundant words
Assuming that 'shall be considered to be' actually means 'shall be', it is redundant.
Replace:
the first one shall be considered to be the separator

with:
the first one shall be the separator
There are no notes attached to this issue.




Viewing Issue Advanced Details
1716 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 14:31 2023-07-18 11:07
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3117
105071
See Note: 0006375
Inconsistent markup for 'target'
Line 105071 says:
> how to build target from target.s2.

The word 'target' occurs twice, in the same syntactical function. Nevertheless, the first occurrence is in italic, the second is in bold.
Use the same markup for both occurrences of 'target'.
Notes
(0006375)
Don Cragun   
2023-06-29 16:19   
Change on P3117, L105071 in make extended description:
build target from target.s2
to:
build target from target.s2



Throughout the Inference Rules section, change all occurrences of s1 or s1 to s1, and s2 or s2 to s2




Viewing Issue Advanced Details
1719 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 14:41 2023-08-08 10:51
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3118
105142
Re-evaluate 'shall be unspecified'
Line 105142 says:
> The meaning of the $< macro shall be otherwise unspecified.

The use of the word 'shall' suggests that implementations must not document or specify the meaning of '$<' in other contexts.
Replace 'shall' with 'is'.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1720 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 14:48 2023-08-08 10:54
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3122
105283
Remove 'should' for 'makefile authors'
Line 105283 says:
> Although make expands macros that do not exist to an empty string, makefile authors should be aware that it is not safe to assume that a macro which has not intentionally been set to a specific value will expand to an empty string for everyone who uses the makefile.

The part 'makefile authors should be aware that' does not add anything to the sentence.
Remove 'makefile authors should be aware that'.
Notes




Viewing Issue Advanced Details
1721 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 14:52 2023-08-08 10:56
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3125
105452
Add missing numbering
The treatment of escaped <newline> characters is not related to the archive interface, therefore it should get its own numbering item.
Before line 105452, add '7.'.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1722 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-05-07 15:11 2023-08-08 11:05
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3103-3136
several
See Note: 0006379.
Typos
.
In line 104664, replace 'both types' with 'both kinds'. (See line 104662.)

In line 104955 and several others, replace 'right hand side' with 'right-hand side' if you consider 'right-hand' to be an adjective. Same for 'left hand side'.

In line 105109, replace 'all of the following' with 'all of', to align with line 105104.

In line 105114, replace 'The $@ shall' with 'The $@ macro shall', to align with line 105120.

In line 105128, replace 'the list of prerequisites of the target contain' with 'the list of prerequisites of the target contains'.

In line 105199, replace 'DOUBLE SUFFIX RULES' with 'DOUBLE-SUFFIX RULES' if you consider it to be an adjective. Make it consistent with line 105186.

In line 105272, replace 'all of the action' with 'all of the actions'.

In line 105280, replace 'the standard set of default rules use' with 'the standard set of default rules uses'.

In line 105346, replace 'the description' with 'the descriptions'.

In line 105482, replace 'an POSIX' with 'a POSIX'.

In line 105697, replace 'has often been' with 'have often been'.

In line 105706, replace 'used operator' with 'used the operator'.

In line 105731, replace 'influences behavior' with 'influences the behavior'.

In line 105777, consider replacing 'need not' with 'does not need to'.

In line 105778, replace 'on current' with 'on the current'.

In line 105790, replace 'control overall' with 'control the overall'.
Notes
(0006379)
Don Cragun   
2023-07-10 15:44   
Make the changes suggested in the Desired Action except for the change to line 105777.




Viewing Issue Advanced Details
1723 [Issue 8 drafts] Shell and Utilities Editorial Clarification Requested 2023-05-07 15:15 2023-08-08 11:07
rillig
 
normal  
Applied  
Accepted As Marked  
   
Roland Illig
make
3127
105509
Note: 0006380
Re-check whether POSIX make is still a subset of almost all implementations
Line 105509 says:
> Because the syntax specified for the make utility is, by and large, a subset of the syntaxes accepted by almost all versions of make, [...]

If this statement still holds (even after defining immediate-expansion macros, the '::=' and ':::=' macro definition operators, the lazily updated include files, and probably more), mention these features to make the statement more convincing.

If this statement doesn't hold anymore, reword it to reflect reality.
.
Notes
(0006380)
geoffclare   
2023-07-10 15:58   
Change:
Because the syntax specified for the make utility is, by and large, a subset of the syntaxes accepted by almost all versions of make, it was decided that it would be counter-productive to change the name.
to:
Because the syntax specified for the make utility was, by and large, a subset of the syntaxes accepted by almost all versions of make when the original IEEE Std 1003.2-1992 shell and utilities standard was being developed, it was decided that it would be counter-productive to change the name.




Viewing Issue Advanced Details
1725 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Editorial Clarification Requested 2023-05-09 22:16 2023-08-17 10:55
steffen
 
normal  
Applied  
Accepted  
   
steffen
mailx
3086
103809
---
mailx: *screen*: specify default
This variable is exceptionally not given a default.
Note: page and line number are for 202x/D3, March 2023.

On page 3086, line 103809, change

  of these shall be used.

to

  of these shall be used. The default shall be noscreen.
Notes




Viewing Issue Advanced Details
1726 [Issue 8 drafts] System Interfaces Editorial Clarification Requested 2023-05-10 06:29 2023-08-08 11:09
Florian Weimer
 
normal  
Applied  
Accepted As Marked  
   
Florian Weimer
Red Hat
swbz#178
strlcat
2133
69861
Note: 0006382
strlcat specification is ambiguous regarding return value
A glibc developer tried to implement a hand-written assembler version of strlcat based on the POSIX specification and the OpenBSD manual page, and they were surprised when our test suite flagged their implementation as broken.

Effectively, we test that

  strlcat (buf, src, 0)

is equivalent to:

  strlen (src)

But the specification can be easily read as saying that it should be

  strlen (buf) + strlen (src)

i.e., that it does not matter whether the original contents of the destination buffer contain null bytes or not.
Existing implementations use the buffer size as a bound for the length of the original buffer contents. This is documented fairly explicitly in the Solaris manual page:

“The function returns min{dstsize, strlen(dst)} + strlen(src).”

<https://docs.oracle.com/cd/E36784_01/html/E36874/strlcat-3c.html> [^]

I think the POSIX version should be changed so that it is clear that it does not mandate a different behavior. Either it should say explicitly that the return value of strlcat is

  strnlen(dst, dstsize) + strlen(src)

or that strlcat behavior is undefined if there is no null byte among the first dstsize bytes in the buffer at buf.
Notes
(0006382)
geoffclare   
2023-07-10 16:24   
Change:
Upon successful completion, the strlcat() function shall return the initial length of the string pointed to by dst plus the length of the string pointed to by src.
to:
Upon successful completion, the strlcat() function shall return the initial length of the string (if any) pointed to by dst, as limited by dstsize, plus the length of the string pointed to by src; that is, the value that would be returned by strnlen(dst, dstsize) + strlen(src) before the strlcat() call.
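
The difference between the two readings of the return value can be worked through with a small arithmetic model. This is only a sketch: strlcat_return is a hypothetical shell helper that computes strnlen(dst, dstsize) + strlen(src) from string lengths and copies nothing, and it assumes ASCII strings so that the shell's length operator counts bytes.

```sh
# strlcat_return is a hypothetical helper that models only the return value
# strnlen(dst, dstsize) + strlen(src). It does not copy or truncate anything.
# Usage: strlcat_return DST DSTSIZE SRC    (ASCII strings assumed)
strlcat_return() {
    dst_len=${#1}
    if [ "$dst_len" -gt "$2" ]; then
        dst_len=$2    # strnlen() never reports more than dstsize
    fi
    echo $(( dst_len + ${#3} ))
}

strlcat_return 'abc' 0 'hello'     # dstsize 0: 0 + 5, i.e. strlen(src)
strlcat_return 'abc' 16 'hello'    # dst fits within dstsize: 3 + 5
```

Under the unbounded reading, strlen(dst) + strlen(src), the first call would give 8 rather than 5, which is the discrepancy the test suite flagged.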




Viewing Issue Advanced Details
1727 [Issue 8 drafts] System Interfaces Objection Enhancement Request 2023-05-14 19:23 2023-08-08 11:17
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XSH 3 / strptime
2146-7
70221-2, 70250-7
Note: 0006384
strptime() spec needs updates to deal with other changes.
When tm_gmtoff and tm_zone were added to struct tm, the %z and %Z
conversions of strptime() should really have been updated to deal
with them.

And while here, the (CX shaded) %s conversion gives no clue at all
about what effect it might have on the struct tm (unlike say %g %G %U...)

Around lines 70221-2 (the %s definition) and within the CX shading, add
words similar to those used elsewhere (eg: %g)

    The effect of this number, if any, on the tm structure pointed to by
    tm is unspecified.

For %z, lines 70250-2 the statement that is currently there, just like the
ones that should be added for %s, should be deleted (as in the tm_gmtoff
field will now be set to the value parsed by %z).

For %Z, lines 70253-7 the similar statement should also be removed, but this
one probably needs some investigation as to what can be said about what is
done with tm_zone - does it get set to point into the value pointed to by
the buf arg to strptime() or does that get copied - to static storage, or
dynamic, and if the latter, who frees it? I have no idea. But simply
pretending that tm_zone still doesn't exist cannot be correct.
Notes
(0006384)
geoffclare   
2023-07-13 15:17   
On page 2145 line 70181 section strptime() (%a), append:
The tm_wday member of the tm structure pointed to by tm shall be set to the corresponding day of the week number (Sunday=0).


On page 2146 line 70184 section strptime() (%b), append:
The tm_mon member of the tm structure pointed to by tm shall be set to the corresponding month number.


On page 2146 line 70186 section strptime() (%c), append:
The members of the tm structure pointed to by tm shall be set as specified for the conversions present in the locale's d_t_fmt value.


On page 2146 line 70189 section strptime() (%C), append:
The tm_year member of the tm structure pointed to by tm shall be set to the number formed by appending the last two digits of the year to these digits, minus 1900. If a <tt>y</tt> conversion is also performed, the last two digits of the year shall be those processed by the <tt>y</tt> conversion; otherwise, they shall be 00.


On page 2146 line 70190 section strptime() (%d), append:
The tm_mday member of the tm structure pointed to by tm shall be set to this number.


On page 2146 line 70191 section strptime() (%D), change:
The date as <tt>%m/%d/%y</tt>.

to:
Equivalent to <tt>%m/%d/%y</tt>.


On page 2146 line 70198 section strptime() (%F), append:
The members of the tm structure pointed to by tm shall be set as specified for the <tt>Y</tt>, <tt>m</tt>, and <tt>d</tt> conversions.


On page 2146 line 70209 section strptime() (%H), append:
The tm_hour member of the tm structure pointed to by tm shall be set to this number.


On page 2146 line 70211 section strptime() (%I), append:
If a <tt>p</tt> conversion is also performed, the tm_hour member of the tm structure pointed to by tm shall be set to the hour, by the 24-hour clock, corresponding to the combined results of the <tt>I</tt> and <tt>p</tt> conversions. If a <tt>p</tt> conversion is not also performed, the behavior is unspecified.


On page 2146 line 70213 section strptime() (%j), append:
The tm_yday member of the tm structure pointed to by tm shall be set to this number minus 1.


On page 2146 line 70214 section strptime() (%m), append:
The tm_mon member of the tm structure pointed to by tm shall be set to this number minus 1.


On page 2146 line 70215 section strptime() (%M), append:
The tm_min member of the tm structure pointed to by tm shall be set to this number.


On page 2146 line 70217 section strptime() (%p), append:
If an <tt>I</tt> conversion is also performed, the tm_hour member of the tm structure pointed to by tm shall be set as specified for the <tt>I</tt> conversion; otherwise, the behavior is unspecified.


On page 2146 line 70219 section strptime() (%r), append:
The members of the tm structure pointed to by tm shall be set as specified for the conversions present in the locale's t_fmt_ampm value.


On page 2146 line 70220 section strptime() (%R), change:
The time as <tt>%H:%M</tt>.

to:
Equivalent to <tt>%H:%M</tt>.


On page 2146 line 70222 section strptime() (%s), add a sentence:
The effect of this number, if any, on the tm structure pointed to by tm is unspecified.

and remove the CX shading from the description of the s conversion.

On page 2147 line 70223 section strptime() (%S), append:
The tm_sec member of the tm structure pointed to by tm shall be set to this number.


On page 2147 line 70225 section strptime() (%T), change:
The time as <tt>%H:%M:%S</tt>.

to:
Equivalent to <tt>%H:%M:%S</tt>.


On page 2147 line 70226 section strptime() (%u), append:
The tm_wday member of the tm structure pointed to by tm shall be set to this number modulo 7.


On page 2147 line 70233 section strptime() (%w), append:
The tm_wday member of the tm structure pointed to by tm shall be set to this number.


On page 2147 line 70237 section strptime() (%x), append:
The members of the tm structure pointed to by tm shall be set as specified for the conversions present in the locale's d_fmt value.


On page 2147 line 70238 section strptime() (%X), append:
The members of the tm structure pointed to by tm shall be set as specified for the conversions present in the locale's t_fmt value.


On page 2147 line 70238 section strptime() (%y), change:
When format contains neither a <tt>C</tt> conversion specifier nor a <tt>Y</tt> conversion specifier, values in the range ...

to:
If a <tt>C</tt> conversion is not also performed, values in the range ...


On page 2147 line 70243 section strptime() (%y), append:
If a <tt>C</tt> conversion is also performed, the tm_year member of the tm structure pointed to by tm shall be set as specified for the <tt>C</tt> conversion; otherwise, the tm_year member shall be set to the calculated year minus 1900.


On page 2147 line 70249 section strptime() (%Y), append:
The tm_year member of the tm structure pointed to by tm shall be set to this number minus 1900.


On page 2147 line 70251 section strptime() (%Z), change:
If this name matches tzname[1], and tzname[0] and tzname[1] differ, then the tm_isdst field of the tm structure pointed to by tm shall be set to 1. Otherwise, if this name matches tzname[0] then the tm_isdst field of the tm structure pointed to by tm shall be set to 0. Any other effects on the tm structure pointed to by tm are unspecified.

to:
If this name matches the name pointed to by tzname[1], and the names pointed to by tzname[0] and tzname[1] differ, then the tm_isdst member of the tm structure pointed to by tm shall be set to 1. Otherwise, if this name matches the name pointed to by tzname[0] then the tm_isdst member of the tm structure pointed to by tm shall be set to 0. The tm_zone and tm_gmtoff members of the structure may also be set in an unspecified manner. Members other than tm_isdst, tm_zone, and tm_gmtoff may be affected if an <tt>s</tt> conversion is also performed but shall otherwise not be affected.


On page 2149 line 70311 section strptime(), change:
If a match is found, values for the appropriate tm structure members are set to values corresponding to the locale information.

to:
If a match is found, values for the affected tm structure members are set as specified in the description of the conversion specification.


After page 2150 line 70349 section strptime() (APPLICATION USAGE), add:
The effect of the <tt>s</tt> conversion is unspecified because existing implementations differ in behavior. Some do a conversion equivalent to gmtime(), ignoring all available timezone information; some do a conversion equivalent to localtime(), using the same timezone it would use and ignoring any timezone information provided by a <tt>z</tt> or <tt>Z</tt> conversion. Although none has been observed, there may be existing (or future) implementations that use timezone information provided by a <tt>z</tt> or <tt>Z</tt> conversion, although using the latter would not be reliable as timezone names are often ambiguous. Applications that need to convert a seconds since the Epoch value to a tm structure should call gmtime() or localtime() (or their thread-safe equivalents) directly.

The effect of the <tt>z</tt> conversion is unspecified because existing implementations differ in behavior. Some just use it to set the tm_gmtoff member of the tm structure; some use the value to adjust the other field members to represent UTC, convert the resulting time to a seconds since the Epoch value, and then convert back to a tm structure by the equivalent of localtime(). An application that needs either of these behaviors should perform the necessary processing explicitly itself.

Although the <tt>Z</tt> conversion might be expected to set the tm_zone member of the tm structure, no existing implementation has been found that sets it. Applications that need it set should set it explicitly after calling strptime().
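
The member assignments listed above (tm_year as the year minus 1900, tm_mon as the month minus 1, and so on) can be checked with a short program. This is a sketch under stated assumptions: the function names parse_timestamp and epoch_to_utc and the chosen format string are illustrative, not taken from the standard. The second function follows the APPLICATION USAGE advice to call gmtime_r() directly rather than relying on the s conversion.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: parse "YYYY-MM-DD HH:MM:SS" into *out.
 * Returns 0 on success, -1 if the input does not match completely. */
int parse_timestamp(const char *text, struct tm *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);   /* members strptime() leaves alone stay 0 */
    const char *end = strptime(text, "%Y-%m-%d %H:%M:%S", out);
    return (end != NULL && *end == '\0') ? 0 : -1;
}

/* Hypothetical helper: convert seconds since the Epoch to broken-down UTC
 * by calling gmtime_r() directly instead of using the s conversion. */
int epoch_to_utc(time_t seconds, struct tm *out)
{
    assert(out != NULL);
    return gmtime_r(&seconds, out) != NULL ? 0 : -1;
}
```

A caller that needs tm_zone or tm_gmtoff set would still have to assign them explicitly after the call, as the added APPLICATION USAGE text points out.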




Viewing Issue Advanced Details
1728 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Error 2023-05-18 08:56 2023-08-17 10:56
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
getprotobyname()
1076
36526
---
typo on the getprotobyname() pointer page
The getprotobyname() pointer page has a typo in NAME section.
Change:
getprotent
to:
getprotoent

There are no notes attached to this issue.




Viewing Issue Advanced Details
1729 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Clarification Requested 2023-05-18 09:30 2023-08-17 10:58
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
mkdir()
1317
43825-6
---
mkdir() ENOENT and ENOTDIR overlap
For a call such as:
mkdir("regular_file/foo", mode)
the descriptions of ENOENT and ENOTDIR on the mkdir() page both apply:
[ENOENT]
A component of the path prefix specified by path does not name an existing directory ...
[ENOTDIR]
A component of the path prefix names an existing file that is neither a directory nor a symbolic link to a directory.
For other file creation functions, e.g. open(), mkfifo(), and fopen(), ENOENT uses "existing file" instead of "existing directory", avoiding this problem.

Also, strictly speaking path does not specify a path prefix (it specifies a complete pathname). Again, other pages don't have this problem, as they refer to "the path prefix of path".
Change:
A component of the path prefix specified by path does not name an existing directory
to:
A component of the path prefix of path does not name an existing file

There are no notes attached to this issue.
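
The overlapping case from the description, mkdir("regular_file/foo", mode), can be reproduced with a small test function. This is only a sketch: the /tmp scratch location, the file names, and the function name mkdir_under_regular_file_errno are assumptions made for illustration. With the corrected ENOENT wording only ENOTDIR describes this call, and Linux, for instance, reports ENOTDIR.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno value set by
 * mkdir("<scratch>/regular_file/foo"), 0 if mkdir() unexpectedly
 * succeeded, or -1 if the scratch files could not be set up. */
int mkdir_under_regular_file_errno(void)
{
    char dir[] = "/tmp/mkdir_demo_XXXXXX";   /* assumed scratch location */
    char file[64];
    char target[64];
    int result = -1;
    int n;

    if (mkdtemp(dir) == NULL)
        return -1;
    n = snprintf(file, sizeof file, "%s/regular_file", dir);
    assert(n > 0 && (size_t)n < sizeof file);
    n = snprintf(target, sizeof target, "%s/foo", file);
    assert(n > 0 && (size_t)n < sizeof target);

    int fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        close(fd);
        /* The path prefix component "regular_file" exists but is a
         * regular file, not a directory. */
        if (mkdir(target, 0700) == 0) {
            rmdir(target);
            result = 0;
        } else {
            result = errno;              /* save before any cleanup call */
        }
        unlink(file);
    }
    rmdir(dir);
    return result;
}
```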




Viewing Issue Advanced Details
1730 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Editorial Error 2023-05-18 09:43 2023-08-17 10:59
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
m4
2933
97034
---
m4 synopsis does not show file operand as optional
The m4 synopsis is:
m4 [-s] [-D name[=val]]... [-U name]... file...
with no square brackets around "file..." that would indicate it is optional.

However, the description of the file operand and the STDIN section both state that if no file operand is given, the standard input is used.
Change:
file...
to:
[file...]

There are no notes attached to this issue.




Viewing Issue Advanced Details
1731 [1003.1(2016/18)/Issue7+TC2] System Interfaces Objection Clarification Requested 2023-05-23 09:43 2023-08-17 11:01
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
pthread_sigmask()
1734
56243
---
See Note: 0006327.
pthread_sigmask() pending signal requirement time paradox
In this statement:
If there are any pending unblocked signals after the call to sigprocmask(), at least one of those signals shall be delivered before the call to sigprocmask() returns.
the normal interpretation of "after the call" would be after the call returns, but that obviously can't be what is intended here because it states an action to be performed before the call returns, which would result in a time paradox.
After applying bug 1636, change:
If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.
to:
If the argument set is not a null pointer and the change made to the currently blocked set of signals causes any pending signals to become unblocked, at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

Notes
(0006327)
geoffclare   
2023-06-13 09:47   
New suggested resolution ...

After applying bug 1636, change:
If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.
to:
If the argument set is not a null pointer, after pthread_sigmask() changes the currently blocked set of signals it shall determine whether there are any pending unblocked signals; if there are any, then at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

On page 1736 line 56316 section pthread_sigmask(), change APPLICATION USAGE from:
None.
to:
Although pthread_sigmask() has to deliver at least one of any pending unblocked signals that exist after it has changed the currently blocked set of signals, there is no requirement that the delivered signal(s) include any that were unblocked by the change. If one or more signals that were already unblocked become pending (see [xref to 2.4.1]) during the period the pthread_sigmask() call is executing, the signal(s) delivered before the call returns might include only those signals.
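
The delivery requirement can be observed in a single-threaded process by blocking a signal, making it pending, and then unblocking it. The sketch below assumes SIGUSR1 is free for the program to use; the handler and the function name pending_signal_delivered_before_return are hypothetical. Since only one signal is pending here, it is necessarily the one delivered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_usr1;

static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
}

/* Hypothetical helper: returns 1 if SIGUSR1, made pending while blocked,
 * had been delivered by the time the unblocking pthread_sigmask() call
 * returned; 0 if it had not; -1 if the setup failed. */
int pending_signal_delivered_before_return(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &block, &old_mask) != 0)
        return -1;

    got_usr1 = 0;
    raise(SIGUSR1);            /* stays pending because SIGUSR1 is blocked */
    assert(got_usr1 == 0);     /* a blocked signal must not be delivered */

    /* The change to the blocked set leaves a pending unblocked signal,
     * so the handler is expected to run before this call returns. */
    pthread_sigmask(SIG_UNBLOCK, &block, NULL);
    int delivered = got_usr1;

    /* Restore the caller's mask and signal disposition. */
    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return delivered ? 1 : 0;
}
```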




Viewing Issue Advanced Details
1732 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Error 2023-05-23 13:50 2023-09-05 11:00
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
cp, mv
2609, 3020
84793, 100451
Approved
See Note: 0006410.
cp and mv EXIT STATUS does not account for -i
The description of cp exit status 0 is:
All input files were copied successfully.
and for mv it is:
All input files were moved successfully.
These do not take into account a "no" answer to a prompt when -i is used.

Compare with rm:
Each directory entry was successfully removed, unless its removal was canceled by a non-affirmative response to a prompt for confirmation.

On page 2609 line 84793 section cp, change:
All input files were copied successfully.
to:
Each file was successfully copied, unless copying it was canceled by a non-affirmative response to a prompt for confirmation.

On page 3020 line 100451 section mv, change:
All input files were moved successfully.
to:
Each file was successfully moved, unless moving it was canceled by a non-affirmative response to a prompt for confirmation.

Notes
(0006410)
Don Cragun   
2023-07-31 15:26   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
The cp and mv utilities' EXIT STATUS sections did not properly account for the interaction with the -i option.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

On page 2609 line 84793 section cp, change:
All input files were copied successfully.
to:
All requested files (excluding files where a non-affirmative response was given to a request for confirmation) were successfully copied.



On page 3020 line 100451 section mv, change:
All input files were moved successfully.
to:
All requested files (excluding files where a non-affirmative response was given to a request for confirmation) were successfully moved.



On page 3200, line 107342-107343 section rm, change:
Each directory entry was successfully removed, unless its removal was canceled by a non-affirmative response to a prompt for confirmation.
to:
All requested directory entries (excluding directory entries where a non-affirmative response was given to a request for confirmation) were successfully deleted.




Viewing Issue Advanced Details
1733 [Issue 8 drafts] Shell and Utilities Objection Error 2023-05-31 16:51 2023-07-06 10:46
ajosey
 
normal  
Applied  
Accepted  
   
Andrew Josey
IEEE/I-1
make
3013
104453
XCU make synopsis (IEEE/I-1)
The updates to add the -j option to the standard are not reflected in the synopsis. This request is associated with the change requested in https://www.austingroupbugs.net/view.php?id=1652 and the change there as well as this change are both required.
Change "[-f makefile]..." to "[-f makefile]... [-j maxjobs]"
Notes




Viewing Issue Advanced Details
1734 [Issue 8 drafts] Shell and Utilities Objection Error 2023-05-31 16:54 2023-08-08 11:19
ajosey
 
normal  
Applied  
Accepted As Marked  
   
Andrew Josey
IEEE/I-2
mkdtemp
1410
47358
See Note: 0006303
XSH mkdtemp function errors (IEEE/I-2)
The EILSEQ error is in the mkdtemp() shall fail section of the errors, but all three of the functions in this section have a template argument and the same problem should be required to be detected in the mkostemp() and mkstemp() functions as well as in the mkdtemp() function. There is also a missing period at the end of this error.

Move P1410, L47358-47359 to appear after L47353 and add a period to the end of the moved text.

Notes
(0006303)
Don Cragun   
2023-06-01 17:51   
(edited on: 2023-06-02 13:26)
Re Note: 0006297: Geoff, you're correct on both of your points.

Looking more closely at this, I should have noted that the error handling for mkdtemp() should be handled the same way as the errors for mkostemp() and mkstemp(). And there is a missing period at the end of an EILSEQ error, but it is on the mkdir() page instead of on the mkdtemp() page.

Consider replacing the entire Desired Action with the following:

Add a period to the end of the EILSEQ error condition for mkdir() on P1405, L47222.

Delete P1410, L47355-47378 (ERRORS for mkdtemp()).

Add a new sentence to the start of the paragraph on P1410, L47381 (ERRORS for mkdtemp()):
Additional error conditions for the mkdtemp( ) function are defined in mkdir( ).






Viewing Issue Advanced Details
1735 [Issue 8 drafts] Shell and Utilities Objection Error 2023-05-31 16:56 2023-08-08 11:21
ajosey
 
normal  
Applied  
Accepted As Marked  
   
Andrew Josey
IEEE/I-3
XRAT
3698
127098
Note: 0006385
XRAT Removed Functions in Issue 8 table. (IEEE/I-3)
The fattach() function was in Issue 7 and has been removed from Issue 8, but it is not listed in the table of removed functions.


Add fattach() to the table on P3698, L127093.


Notes
(0006298)
geoffclare   
2023-06-01 09:43   
This issue affects more than just fattach(). The list in the table was copied from the editing instructions in bug 0001330. However, those instructions did not explicitly name functions to be removed that were part of the STREAMS and Tracing options, as their removal was covered by the instruction to remove everything that was part of those options.

I see three ways to remedy the problem:

1. State in the text that all functions that were part of the STREAMS and Tracing options were removed, thus making it clear that the table only lists the other removed functions.

2. Add an entry to the table along the lines of "All functions that were part of the STREAMS and Tracing options".

3. Add entries to the table for all the individual STREAMS and Tracing functions.
(0006385)
geoffclare   
2023-07-13 15:45   
Make the changes suggested by Note: 0006298 remedy 3 (with the added functions in alphabetic order in the resulting updated table).

The STREAMS functions to add are:
fattach()
fdetach()
getmsg()
getpmsg()
ioctl()
isastream()
putmsg()
putpmsg()

The Tracing functions to add are those listed on the <trace.h> page in POSIX.1-2017.




Viewing Issue Advanced Details
1736 [Issue 8 drafts] Shell and Utilities Objection Error 2023-05-31 16:57 2023-08-08 11:22
ajosey
 
normal  
Applied  
Accepted  
   
Andrew Josey
IEEE/I-4
XRAT
3697
127040
See Desired Action
XRAT B.1.1 (IEEE/I-4)
The title on this line is "New Features in Issue 7", but this is not a list of changes that were made in Issue 7; it is a list of changes made in Issue 8.



Change "New Features in Issue 7" to "New Features in Issue 8".



There are no notes attached to this issue.




Viewing Issue Advanced Details
1737 [Issue 8 drafts] Front Matter Editorial Omission 2023-05-31 17:02 2023-07-13 15:51
ajosey
 
normal  
Applied  
Accepted As Marked  
   
Andrew Josey
IEEE/I-5 thru I-30
Contents
xxxii
0
Note: 0006310
Table of contents missing items (IEEE/I-5 thru I-30)
List of Figures is missing
Reference to table 3-1 missing in Contents/List of Tables
Reference to table 3-2 missing in Contents/List of Tables
Reference to table 3-3 missing in Contents/List of Tables
Reference to table 3-4 missing in Contents/List of Tables
Reference to table 3-5 missing in Contents/List of Tables
Reference to table 3-6 missing in Contents/List of Tables
Reference to table 3-7 missing in Contents/List of Tables
Reference to table 3-8 missing in Contents/List of Tables
Reference to table 3-9 missing in Contents/List of Tables
Reference to table 3-10 missing in Contents/List of Tables
Reference to table 3-11 missing in Contents/List of Tables
Reference to table 3-12 missing in Contents/List of Tables
Reference to table 3-13 missing in Contents/List of Tables
Reference to table 3-14 missing in Contents/List of Tables
Reference to table 3-15 missing in Contents/List of Tables
Reference to table 3-16 missing in Contents/List of Tables
Reference to table 3-17 missing in Contents/List of Tables
Reference to table 3-18 missing in Contents/List of Tables
Reference to table 3-19 missing in Contents/List of Tables
Reference to table 3-20 missing in Contents/List of Tables
Reference to table 3-21 missing in Contents/List of Tables
Reference to table 3-22 missing in Contents/List of Tables
Reference to table 3-23 missing in Contents/List of Tables
Reference to figure 3-1 missing in Contents/List of Figures
Reference to figure B-1 missing in Contents/List of Figures
Add List of Figures to Contents after List of Tables
Add line for Table 3-1 "Expressions in Decreasing Precedence in awk" on page 2591
Add line for Table 3-2 "Escape Sequences in awk" on page 2599
Add line for Table 3-3 "Operators in bc" on page 2539
Add line for Table 3-4 "Programming Environments: Type Sizes" on page 2658
Add line for Table 3-5 "Programming Environments: c17 Arguments" on page 2659
Add line for Table 3-6 "Threaded Programming Environment: c17 Arguments" on page 2659
Add line for Table 3-7 “Compression algorithms, −m option-argument values, and suffixes” on page 2719
Add line for Table 3-8 "ASCII to EBCDIC Conversion" on page 2762
Add line for Table 3-9 "ASCII to IBM EBCDIC Conversion" on page 2763
Add line for Table 3-10 "File Utility Output Strings" on page 2911
Add line for Table 3-11 "Table Size Declarations in lex" on page 3015
Add line for Table 3-12 "Escape Sequences in lex" on page 3017
Add line for Table 3-13 "ERE Precedence in lex" on page 3017
Add line for Table 3-14 "Named Characters in od" on page 3201
Add line for Table 3-15 "ustar Header Block" on page 3240
Add line for Table 3-16 "ustar mode Field" on page 3241
Add line for Table 3-17 "Octet-Oriented cpio Archive Entry" on page 3244
Add line for Table 3-18 "Values for cpio c_mode Field" on page 3245
Add line for Table 3-19 "Variable Names and Default Headers in ps" on page 3285
Add line for Table 3-20 "Control Character Names in stty" on page 3376
Add line for Table 3-21 "Circumflex Control Characters in stty" on page 3377
Add line for Table 3-22 "uuencode Base64 Values" on page 3477
Add line for Table 3-23 "Internal Limits in yacc" on page 3590
Add line for Figure 3-1 "pax Format Archive Example" on page 3235
Add line for Figure B-1 "Example of a System with Typed Memory" on page 3741
Notes
(0006310)
geoffclare   
2023-06-05 14:41   
This was traced to a bug/feature in the macro used to generate the table of contents. It has been fixed and the latest gitlab build now has all of the entries that were requested here (although it puts Figures before Tables).




Viewing Issue Advanced Details
1740 [Issue 8 drafts] Base Definitions and Headers Objection Error 2023-05-31 17:12 2023-08-08 11:25
ajosey
 
normal  
Applied  
Accepted As Marked  
   
Andrew Josey
ISO/US-002
7.3.2.4
na
na
Note: 0006389
LC_COLLATE NUL (ISO/US-002)
In Volume 1, under 7.3.2.4, the definition of LC_COLLATE does not preclude the possibility that NUL is not at the beginning of the collation ordering. There are no C interfaces defined by POSIX that allow observation of such ordering of NUL; however, such ordering appears to be observable via the string comparison facility of the `test` utility in environments that both treat the utility as intrinsic and allow NUL characters in shell strings (perhaps via command substitution).

The C++ standard library specifies interfaces that would allow observation of such ordering of NUL; however, the lack of standardized C interfaces with such capability means that C++ standard library implementations suffer in terms of quality or portability. localedef is a POSIX facility that serves as a source for locales with exotic sorting of NUL, so it seems within the scope of POSIX to declare that sorting under locales where NUL does not sort as the least value is subject to limitations.


Specify that placing NUL in the collation order in any position other than the first need not succeed in all contexts.
ISO_IEC CD 9945 Collated Comments.doc (26 KB) 2023-05-31 17:12
Notes
(0006389)
geoffclare   
2023-07-17 16:28   
On page 140 line 4644 section 7.3.2, after:
Note: Users installing their own locales should ensure that they define a collation sequence with a total ordering of all characters unless an '@' modifier in the locale name (such as @icase) indicates that it has a special collation sequence.
add:
As <NUL> is reserved as the string terminator for most usages of LC_COLLATE, it is the responsibility of the locale writer to ensure <NUL> has the lowest primary weight in a collation ordering for the interfaces to behave in the way users typically expect. Unusual behavior may result if it has any other collation order weighting, or is subject to IGNORE.

On page 144 line 4795 section 7.3.2.4, after:
order_start  forward;backward
add:
<NUL>        <NUL>;<NUL>




Viewing Issue Advanced Details
1741 [Issue 8 drafts] System Interfaces Objection Error 2023-05-31 17:13 2023-08-08 11:29
ajosey
 
normal  
Applied  
Accepted As Marked  
   
Andrew Josey
ISO/US-003
3
1137
38955-38956
Note: 0006395
getlocalename_l (ISO/US-003)
In Volume 2, Chapter 3, the Description for getlocalename_l specifies that using LC_ALL as the category argument for a call shall result in the call being not successful. This restriction in functionality leaves application developers without a portable way to record, into a string usable with setlocale (with LC_ALL), the “name” of the locale represented by the locale object. In particular, “composite” or “mixed” locales using a different locale definition for at least one category have such “names” formed by the implementation, but the format is not uniform across implementations.

The C++ standard library presents a std::locale type with interfaces that can produce such mixed locales in a thread-safe manner. It also includes a std::locale::global function that requires setlocale interaction with such a mixed std::locale. C++ standard library implementations currently suffer in terms of quality or portability from a lack of a C-level, standardized, thread-safe locale interface that will produce a “name” for such mixed locales. As POSIX provides thread-safe locales as an extension to C, the missing functionality seems to be within the scope of POSIX to provide.
Remove the specification that using LC_ALL results in a call being not successful. Specify that using LC_ALL results in “a string which encodes the locale name(s) for all of the individual categories, consistent with setlocale”. Specify that using LC_ALL for the category returns a string that may be invalidated or overwritten by a subsequent call in the same thread with LC_ALL. Update the rationale; an example use case for LC_ALL is application portability in recording the “international environment” even in the face of extensions such as the introduction of extra categories such as LC_TELEPHONE.
ISO_IEC CD 9945 Collated Comments.doc (26 KB) 2023-05-31 17:13
Notes
(0006395)
geoffclare   
2023-07-20 16:19   
Change:
The getlocalename_l() function shall return the locale name for the given locale category of the locale object locobj, or of the global locale if locobj is the special locale object LC_GLOBAL_LOCALE.

The category argument specifies the locale category to be queried. If the value is LC_ALL or is not a supported locale category value (see [xref to setlocale()]), getlocalename_l() shall fail.
to:
If category is not LC_ALL, the getlocalename_l() function shall return the locale name for the given locale category of the locale object locobj, or of the global locale if locobj is the special locale object LC_GLOBAL_LOCALE.

If category is LC_ALL, the getlocalename_l() function shall return a string that encodes the locale settings for all locale categories of the locale object locobj, or of the global locale if locobj is the special locale object LC_GLOBAL_LOCALE, in the same form as is returned by setlocale(). The string returned is such that a subsequent call to setlocale(), from the same process, with a pointer to that string as locale and the LC_ALL category shall set the global locale to the same locale for each category as was present in the queried object.

If the value of the category argument is neither LC_ALL nor a supported locale category value (see [xref to setlocale()]), getlocalename_l() shall fail.


Change APPLICATION USAGE from "None" to:
In addition to the caveats regarding validity of the returned string pointer in RETURN VALUE, the content of the string returned when category is LC_ALL is only required to be valid for the life of the process, so is not intended for storage or sharing between processes. As the internal format of the string is implementation-specific, there is nothing preventing a subsequent run of an application from being presented a different format, for example if the implementation is updated.
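
Because getlocalename_l() with LC_ALL is new and may not be available in a given C library, the round-trip property described above can be sketched with setlocale() alone, whose LC_ALL string has the same form. The function name lc_all_round_trip and its parameter are hypothetical; the sketch only shows the within-process save and restore pattern, not the new interface itself.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: saves the LC_ALL string of the global locale,
 * switches to the locale named by temporary, restores the saved string,
 * and returns 1 if the restored LC_ALL string matches the saved one,
 * 0 if not, or -1 on failure. */
int lc_all_round_trip(const char *temporary)
{
    assert(temporary != NULL);

    const char *current = setlocale(LC_ALL, NULL);   /* query only */
    if (current == NULL)
        return -1;

    /* Copy the string: later setlocale() calls may overwrite it. */
    char *saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    int ok = 0;
    if (setlocale(LC_ALL, temporary) != NULL &&
        setlocale(LC_ALL, saved) != NULL) {
        const char *restored = setlocale(LC_ALL, NULL);
        ok = (restored != NULL && strcmp(restored, saved) == 0);
    }
    free(saved);
    return ok;
}
```

The saved string is used only within the same process, in line with the caution above that its format is implementation-specific and not meant for storage or sharing.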




Viewing Issue Advanced Details
1743 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-06-08 20:48 2023-08-22 14:27
steffen
 
normal  
Applied  
Accepted  
   
steffen
mailx
3085, 3087
103783, 103869
mailx: revert faulty change
Hello.

Unfortunately I introduced "another fault", another non-portable-across-implementations problem.

It is the *metoo* variable that does not affect `alternates' (command) names in the historical, and the BSD codebases, but _only_ $LOGNAME.
For those, `alternates' are _always_ removed from recipient lists (and _only_ when replying).
(And the *from* and *sender* variables are non-portable.)

(It also seems that further mailx(1) development can hardly be expected to proceed in a portable, standardizable manner.)
On page 3085, line 103783 ff., change

  Suppress the deletion of the user’s login name and any alternative addresses from the recipient list when replying to a message or sending to a group. The default shall be nometoo.

to

  Suppress the deletion of the user’s login name from the recipient list when replying to a message or sending to a group. The default shall be nometoo.

On page 3087, line 103869 ff., change

  Declare a list of alternative addresses for the address consisting of the user’s login name. When responding to a message or sending to a group, if the metoo variable is unset these alternative addresses shall be removed from the list of recipients. The comparison of addresses shall be performed in a case-insensitive manner. With no arguments, alternates shall write the current list of alternative addresses.

to

  Declare a list of alternative addresses for the address consisting of the user’s login name. When responding to a message these alternative addresses shall be removed from the list of recipients. The comparison of addresses shall be performed in a case-insensitive manner. With no arguments, alternates shall write the current list of alternative addresses.

Page and line numbers from IEEE P1003.1™-202x/D3, March 2023.
Notes




Viewing Issue Advanced Details
1744 [Issue 8 drafts] Base Definitions and Headers Editorial Enhancement Request 2023-06-10 07:16 2023-08-22 14:28
nrk
 
normal  
Applied  
Accepted  
   
Nickolas Raymond Kaczynski
2.4.3 Signal Actions
517
18354
Explicitly require killpg to be async-signal-safe
Description of killpg() states:

    If pgrp is greater than 1, killpg(pgrp, sig) shall be equivalent to kill(-pgrp, sig).

And since kill() is async-signal-safe, killpg() should be as well but it's not listed as such explicitly.
Explicitly add killpg() to the list of async-signal-safe functions.
There are no notes attached to this issue.
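
The equivalence quoted in the description, killpg(pgrp, sig) behaving as kill(-pgrp, sig) for pgrp greater than 1, can be exercised with a child placed in its own process group. This is a minimal sketch; the function name signal_new_group and its use_killpg parameter are hypothetical, and the example does not itself demonstrate async-signal-safety.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child into its own process group, signals
 * that group with killpg(pgrp, sig) or kill(-pgrp, sig), and returns the
 * number of the signal that terminated the child, or -1 on failure.
 * sig should be a signal whose action terminates the child. */
int signal_new_group(int sig, int use_killpg)
{
    assert(sig > 0);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        setpgid(0, 0);           /* become leader of a new process group */
        for (;;)
            pause();             /* wait until a signal terminates us */
    }

    /* Also set the group from the parent, so that it exists before the
     * signal is sent regardless of which process runs first. */
    setpgid(child, child);

    /* The child's pid is its process group ID and is greater than 1. */
    int rc = use_killpg ? killpg(child, sig) : kill(-child, sig);
    if (rc != 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }

    int status;
    if (waitpid(child, &status, 0) != child || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status);
}
```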




Viewing Issue Advanced Details
1745 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2023-06-13 14:32 2023-09-05 11:02
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
tsort
3319
111766
Approved
See Note: 0006408.
tsort input and output format clarifications
The tsort description seems to require applications to supply a single line of input:
The application shall ensure that the input consists of pairs of items (non-empty strings) separated by <blank> characters.
However, the example shows input split over multiple lines.

It appears that most implementations accept any white-space characters as separators, although the GNU coreutils version only accepts blanks and newlines.

Also, it doesn't say whether there can be multiple separator characters between items, which all implementations I tried accept, or that the output is one-item-per-line.
On page 3319 line 111766 section tsort, change:
The application shall ensure that the input consists of pairs of items (non-empty strings) separated by <blank> characters. Pairs of different items indicate ordering. Pairs of identical items indicate presence, but not ordering.
to:
The application shall ensure that the input consists of pairs of items (non-empty strings) separated by one or more <blank> or <newline> characters. It is unspecified whether other white-space characters can also be used as separators. Pairs of different items shall indicate ordering. Pairs of identical items shall indicate presence, but not ordering.

On page 3319 line 111798 section tsort, change:
The standard output shall be a text file consisting of the order list produced from the partially ordered input.
to:
The standard output shall be a text file consisting of the ordered list of items, with one item per line, produced from the partially ordered input.
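
As a rough illustration of the clarified wording (not part of the proposed text), the sketch below feeds tsort pairs that are split over several lines with more than one blank between items, and relies on the output being one item per line. The helper name order_items and the sample items are made up for the example.

```sh
# order_items: hypothetical helper; reads item pairs on standard input and
# writes the ordered items to standard output, one per line.
order_items() {
    tsort
}

# Items are separated by one or more <blank> or <newline> characters, so a
# pair may be split across lines (the pair "b c" below).
printf 'a  b\nb\nc\n' | order_items
```

The input forms the pairs (a, b) and (b, c), so the only valid ordering is a, b, c, each on its own line.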

Notes
(0006408)
Don Cragun   
2023-07-27 16:11   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
None.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in the Desired Action.




Viewing Issue Advanced Details
1746 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2023-06-13 15:58 2023-09-05 11:05
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
fuser
2817
92698
Approved
See Note: 0006406.
fuser output format clarification
As discussed on the mailing list in October 2021, the use of "%d" as the output format for each PID written by fuser allows there to be no separation between multiple PIDs. The standard should require at least one blank before each PID (including the first: its preceding blank(s) separate it from the pathname written to standard error in the case that standard output and standard error are directed to the same file).

The "%d" format also allows the PID to be followed by <blank> characters, but I believe the intention (and current practice) is that the usage character written to standard error immediately follows the PID (when standard output and standard error are directed to the same file).
On page 2817 line 92698 section fuser, change:
"%d"
to:
" %1d"

On page 2817 line 92711 section fuser, change:
When standard output and standard error are directed to the same file, the output shall be interleaved so that the filename appears at the start of each line, followed by the process ID and characters indicating the use of the file. Then, if the -u option is specified, the user name or user ID for each process using that file shall be written.
to:
When standard output and standard error are directed to the same file, the output shall be interleaved so that the pathname written to standard error appears at the start of each line and is immediately followed by the <blank> character(s) and process ID written to standard output, which are immediately followed by the characters written to standard error indicating the use of the file and, if the -u option is specified, the user name or user ID (in parentheses).
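
To make the separation problem concrete, here is a small sketch that emulates the two formats with the printf utility rather than with fuser itself. The function names are hypothetical, and old_format shows only the worst case that "%d" permits, not what any particular implementation does.

```sh
# The printf utility reuses the format string for each remaining operand,
# which makes it a convenient way to emulate both STDOUT formats.
old_format() { printf '%d' "$@"; }
new_format() { printf ' %1d' "$@"; }

old_format 123 4567; echo    # 1234567 (the boundary between PIDs is lost)
new_format 123 4567; echo    # " 123 4567" (each PID has a leading blank)

# With the new format, field splitting recovers the individual PIDs.
# pids_using is a hypothetical wrapper: it discards standard error
# (pathname and use characters) and writes one PID per line.
pids_using() {
    for pid in $(fuser "$1" 2>/dev/null); do
        printf '%s\n' "$pid"
    done
}
```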

Notes
(0006341)
geoffclare   
2023-06-20 15:56   
New proposal that (I hope) addresses all the points raised so far...

On page 2816 line 92653 section fuser (NAME), change:
list process IDs of all processes that have one or more files open
to:
list process IDs of all processes that are using one or more named files

On page 2816 line 92657 section fuser (DESCRIPTION), change:
The fuser utility shall write to standard output the process IDs of processes running on the local system that have one or more named files open. For block special devices, all processes using any file on that device are listed.

The fuser utility shall write to standard error additional information about the named files indicating how the file is being used.

Any output for processes running on remote systems that have a named file open is unspecified.

A user may need appropriate privileges to invoke the fuser utility.
to:
For each file operand, in order, fuser shall write one line of output, some of it to standard output, and the rest to standard error, giving information about processes running on the local system that are using the file. A process shall be considered to be using a file if it has at least one open file descriptor associated with the file or if the file is a directory that is the current working directory or the root directory for the process, and may be considered to be using a file for other implementation-dependent reasons. If file names a block special device that contains a mounted file system, and the -f option is not specified, any processes using any file on that mounted file system and any processes that are using the device file itself shall be listed.

Any output for processes running on remote systems that are using a named file is unspecified.

A user may need appropriate privileges to invoke the fuser utility.

When standard output and standard error are directed to the same file, the output for each file operand shall be interleaved so that it is written to the file in the following order:
  • On standard error, a pathname for the file, immediately followed by a <colon> and zero or more <blank> characters. The pathname shall be either the file operand (unaltered) or the pathname that would result from a successful call to the realpath() function, defined in the System Interfaces volume of POSIX.1-202x, with the file operand as its file_name argument.

  • For each process using the file:
    • On standard output, the process ID in the format:
      " %1d", <process ID>

    • On standard error, information about the file's use by the process, in the following format:
      "%s", <use chars>
      if the -u option is not specified, or in the following format:
      "%s(%s)", <use chars>, <user name>
      if the -u option is specified, where <use chars> is a string of zero or more characters indicating the use of the file and <user name> is the user name corresponding to the real user ID of the process or, if the user name cannot be resolved from the real user ID of the process, the real user ID of the process in decimal. The value of <use chars> shall include the character 'c' if the process is using the file as its current directory and the character 'r' if the process is using the file as its root directory; implementations may include other alphabetic characters to indicate other uses of the file.

  • On standard error, a <newline> character.


When standard output and standard error are not directed to the same file, the data written to each shall be as described above but the ordering of writes to standard output relative to writes to standard error is unspecified. For example, fuser might first write the information for all file operands to standard error and then write all of the process IDs to standard output.
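
The ordering described above can be sketched with a stand-in that writes fixed data to the two streams. fake_fuser is purely hypothetical: the pathname, process IDs and user names are invented, and it assumes the -u form of the output for every process.

```sh
# fake_fuser: hypothetical stand-in that emits, for one file operand and a
# fixed list of two processes, the writes called for by the proposed text.
fake_fuser() {
    printf '%s:' "$1" >&2               # pathname and <colon> on standard error
    printf ' %1d' 101                   # process ID on standard output
    printf '%s(%s)' 'c' 'alice' >&2     # use chars and user name on standard error
    printf ' %1d' 202
    printf '%s(%s)' 'r' 'root' >&2
    printf '\n' >&2                     # final <newline> on standard error
}

fake_fuser /mnt 2>&1           # same file: /mnt: 101c(alice) 202r(root)
fake_fuser /mnt 2>/dev/null    # standard output alone: " 101 202"
echo
```

Capturing standard output alone yields just the blank-separated process IDs, which is the form a script would normally want to consume.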

On page 2816 line 92667 section fuser (OPTIONS, -c), change:
The file is treated as a mount point and the utility shall report on any files open in the file system.
to:
If a file operand names a directory that is the mount point of a mounted file system, all processes using any file on that file system shall be listed as if they were using the named directory. The behavior for any file operand that names an existing file that is not the mount point of a mounted file system is unspecified.

On page 2816 line 92674 section fuser (OPERANDS), change:
A pathname on which the file or file system is to be reported.
to:
A pathname of a file for which the processes using the file are to be reported.

On page 2817 line 92696-92698 section fuser, replace the STDOUT section with:
See DESCRIPTION.

On page 2817 line 92700-92716 section fuser, replace the STDERR section with:
The fuser utility shall write diagnostic messages to standard error.

The fuser utility also shall write information to standard error as specified in the DESCRIPTION section.

On page 2818 line 92728 section fuser, change APPLICATION USAGE from "None" to:
Things can change while fuser is running; the snapshot it gives is only true for an instant, and might not be accurate by the time it is displayed.

On page 2818 line 92743 section fuser (EXAMPLES), change:
fuser <block device>
writes to standard output the process IDs of processes that are using any file which is on the device named by <block device> and writes to standard error an indication of how those processes are using the file.
fuser -f <block device>
writes to standard output the process IDs of processes that are using the file <block device> itself and writes to standard error an indication of how those processes are using the file.
to:
fuser <mounted block device>
writes to standard output the process IDs of processes that are using any file on the mounted file system contained by <mounted block device> and of processes that are using the device file <mounted block device> itself, and writes to standard error an indication of how those processes are using the files.
fuser -f <mounted block device>
writes to standard output the process IDs of processes that are using the device file <mounted block device> itself and writes to standard error an indication of how those processes are using the file.
(0006406)
Don Cragun   
2023-07-27 16:01   
Interpretation response
------------------------
The standard states that the process ID is written using the format "%d", and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Format "%d" allows, but does not require a space or tab before the process ID. The standard should require separation between process IDs in order for the output to be usable.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in Note: 0006341




Viewing Issue Advanced Details
1747 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-06-20 18:28 2023-08-22 14:30
steffen
 
normal  
Applied  
Accepted  
   
steffen
mailx
3087
103862
mailx: document alias expansion prevention
All BSD Mail and System V10 mailx codebases support prevention of alias expansion via reverse solidus / backslash.
On page 3087, line 103862 ff., append after

    [.] for example, when hlj is an alias, hlj@posix.com does not trigger
    the alias substitution.

the sentence

    Recursive expansion of an alias group member can be prevented by
    prefixing it with an unquoted <backslash>.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1748 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-06-21 17:24 2023-08-17 11:02
Don Cragun
 
normal  
Applied  
Accepted  
   
Don Cragun
Reported by Lawrence Velázquez in austin-group-l e-mail sequence #36038
Shell command language 2.8.1
2481
80712
Typo "define" s/b "defined".
Missing letter in Issue 7 carried forward into Issue 8 draft 3.
On page 2481, line 80712, change:
define
to:
defined
There are no notes attached to this issue.




Viewing Issue Advanced Details
1749 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-06-26 17:17 2023-08-22 14:32
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3110
104795
Special targets .PHONY and .NOTPARALLEL are not in alphabetic order
.
Move the paragraph starting in line 104795 above line 104810.

If possible, add a source-level annotation "sorted" to that list, which is then checked by a static analysis tool, ensuring that the items are always listed in the correct order.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1750 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-06-26 17:59 2023-08-22 14:35
rillig
 
normal  
Applied  
Accepted  
   
Roland Illig
make
3131
105684
typo: explictily
.
Replace:
explictily
with:
explicitly
There are no notes attached to this issue.




Viewing Issue Advanced Details
1752 [Issue 8 drafts] Base Definitions and Headers Editorial Enhancement Request 2023-06-29 09:25 2023-08-22 14:37
Vincent Lefevre
 
normal  
Applied  
Accepted As Marked  
   
Vincent Lefevre
Inria
<float.h> header
260
9115
Note: 0006400
replace "mantissa" by "significand"
In "the number of mantissa digits", the term "mantissa" is not standard.
The term "significand" should be used instead.

But "the number of digits of the significand" could be less confusing.
Notes
(0006400)
geoffclare   
2023-07-24 16:03   
Change, page 260, line 9115:
the number of mantissa digits
to:
the number of digits in the significand




Viewing Issue Advanced Details
1753 [Issue 8 drafts] System Interfaces Editorial Enhancement Request 2023-06-29 09:27 2023-08-22 14:39
Vincent Lefevre
 
normal  
Applied  
Accepted  
   
Vincent Lefevre
Inria
frexp
1031
35373
replace "mantissa" by "significand"
In "extract mantissa and exponent from a double precision number", the term "mantissa" is not standard.
The term "significand" should be used instead.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1754 [Issue 8 drafts] Base Definitions and Headers Objection Error 2023-06-29 11:15 2023-08-22 14:38
Vincent Lefevre
 
normal  
Applied  
Accepted As Marked  
   
Vincent Lefevre
Inria
<float.h> header
259
9089
Note: 0006401
formula for LDBL_MAX may be incorrect, e.g. for the double-double IBM format
For the maximum representable finite floating-point number, the formula (1−b^(−p))b^emax is given, but this formula is incorrect for LDBL_MAX on PowerPC, where the double-double IBM format is used.

For the reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

For the LDBL_MAX issue, this was resolved as a defect in the C standard by DR 467, mentioned here: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2092.htm
Do the same change as in the C standard, i.e. add "if that number is normalized, its value is".

Note that the ISO C23 draft has:
"maximum representable finite floating-point number; if that number is normalized, its value is [the formula]"
Notes
(0006401)
geoffclare   
2023-07-24 16:22   
On page 260 line 9123 section <float.h>, change FUTURE DIRECTIONS from "None" to:
The formula for calculating FLT_MAX, DBL_MAX, and LDBL_MAX is expected to change in the next revision of the ISO C standard such that it only applies if the values are normalized.




Viewing Issue Advanced Details
1755 [Issue 8 drafts] Base Definitions and Headers Objection Error 2023-07-06 08:44 2023-08-22 14:45
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
1.8.1 Codes
8
196
not deferring to C17 on specifics has knock-on effects
In order not to defer to the C standard for some specific behaviours we have changed the usual boilerplate at the beginning of the DESCRIPTION for affected functions. However, the description of the CX margin code refers to the precise wording of that introductory text, and so needs to be updated. The rationale for that margin code should also explain why the need for these deviations from the C standard has arisen.

In addition, there have been questions raised concerning some wording on the c17 page in XCU which we should clarify in order to ensure it cannot be interpreted as requiring conformance to section 7 (Library) of the C standard.
On page 8 line 196 section 1.8.1 Codes (CX), change:
The functionality described is an extension to the ISO C standard. Application developers may make use of an extension as it is supported on all POSIX.1-202x-conforming systems.

With each function or header from the ISO C standard, a statement to the effect that ``any conflict is unintentional'' is included.
to:
The functionality described is an extension to the ISO C standard or a deviation from it. Application developers can make use of the functionality as it is supported on all POSIX.1-202x-conforming systems.

With each function or header from the ISO C standard, a statement is included to the effect that ``any conflict is unintentional'', or ``any other conflict is unintentional'' if there is an intentional conflict (deviation).

On page 2652 line 87330 section c17, change:
it shall accept source code conforming to the ISO C standard
to:
it shall accept source code written in the C language as defined in section 6 of the ISO C standard

On page 2652 line 87333 section c17, after:
... and a linkage phase, for handling Phase 8 of the ISO C standard and extensions described here.
add a sentence:
The reference to ``library components'' in Phase 8 shall be taken to refer to components of libraries specified using the -l option, libraries specified as file.a or file.so operands, and the equivalent of a -l c option passed to the link editor in the manner specified in the EXTENDED DESCRIPTION.

After page 2660 line 87690 section c17 (APPLICATION USAGE), add a paragraph:
Since this standard requires that conforming applications define either _POSIX_C_SOURCE or _XOPEN_SOURCE before inclusion of any header (see [xref to XSH 2.2.1 POSIX.1 Symbols]), if c17 is used to compile source code that includes a header without defining one of these feature test macros in the required manner, the behavior of c17 itself and the results of using any files it generates are undefined. When c17 is used this way, implementations are encouraged to make visible in headers from the ISO C standard only the symbols that are allowed by that standard, and otherwise to behave the same as if _POSIX_C_SOURCE or _XOPEN_SOURCE had been defined, but portable applications cannot rely on this.

On page 3606 line 123517 section A.1.8.1 Codes (CX), change:
This margin code is used to denote extensions beyond the ISO C standard. For interfaces that are duplicated between POSIX.1-202x and the ISO C standard, a CX introduction block describes the nature of the duplication, with any extensions appropriately CX marked and shaded.
to:
This margin code is used to denote extensions beyond and, in exceptional cases, deviations from the ISO C standard. For interfaces that are duplicated between POSIX.1-202x and the ISO C standard, a CX introduction block describes the nature of the duplication, with any extensions or deviations appropriately CX marked and shaded. Where deviations exist, the reasons for them are explained in the RATIONALE section of the affected interface. Deviations have become necessary because there is no longer any formal way for ISO to acknowledge defects in the ISO C standard. For the original C90 standard and the C99 revision, defect reports (DRs) were issued, but there is no equivalent mechanism for the current revision. Even if the defect is corrected in a later revision, without stating deviations POSIX.1-202x would continue to require the incorrect behavior described in the version of the ISO C standard that it references.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1757 [Issue 8 drafts] Shell and Utilities Editorial Error 2023-07-15 01:59 2023-08-17 11:04
larryv
 
normal  
Applied  
Accepted  
   
Lawrence Velázquez
expr
2895
96621
expr: incorrectly describes BRE subexpression as "[\(...\)]"
The behavior of the expr ':' operator when its pattern operand contains a BRE subexpression is described on page 2895, lines 96620-96623:
Alternatively, if the pattern contains at least one regular expression subexpression "[\(...\)]", the string matched by the back-reference expression "\1" shall be returned. If the back-reference expression "\1" does not match, then the null string shall be returned.
This implies that subexpressions include or must be enclosed between '[' and ']'. However, as per XBD Section 9.3.5, '[' introduces a bracket expression, within which a subexpression cannot occur (since "\(" is not treated specially).

This is the corresponding text from Commands and Utilities, Issue 5, page 337:
Alternatively, if the pattern contains at least one regular expression subexpression [\(...\)], the string corresponding to \1 will be returned.
It seems clear that '[' and ']' were being used in a parenthetical, not literal, capacity, but Issue 6 (https://pubs.opengroup.org/onlinepubs/009695399/utilities/expr.html) missed that and simply wrapped the whole thing in quotation marks. It's been wrong ever since.
Change:
"[\(...\)]"
to:
"\(...\)"
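
For readers unfamiliar with the construct, a minimal sketch of a "\(...\)" subexpression in an expr pattern follows. The helper name strip_txt and the sample operands are made up for the example.

```sh
# strip_txt: hypothetical helper; prints the operand without a trailing
# ".txt", or an empty line if the operand does not end in ".txt".
# The leading "X" keeps operands such as "-n" from being taken as expr
# operators; the pattern consumes it outside the subexpression.
# expr exits with status 1 when the result is null, hence the "|| :".
strip_txt() {
    expr "X$1" : 'X\(.*\)\.txt$' || :
}

strip_txt notes.txt    # prints: notes
strip_txt notes.md     # prints an empty line ("\1" does not match)

# Without a subexpression, ':' yields the number of characters matched.
expr notes.txt : '.*'  # prints: 9
```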
Notes




Viewing Issue Advanced Details
1766 [Issue 8 drafts] System Interfaces Editorial Error 2023-07-18 03:49 2023-08-17 11:05
larryv
 
normal  
Applied  
Accepted  
   
Lawrence Velázquez
catgets
698
24307
catgets: quotation in "Change History" lacks closing quotes
The paragraph on page 698, lines 24304-24307, ends with a quotation that is not closed:
Austin Group Interpretation 1003.1-2001 #044 is applied, changing the ``may fail'' [EINTR] and [ENOMSG] errors to become ``shall fail'' errors, updating the RETURN VALUE section, and updating the DESCRIPTION to note that: ``The results are undefined if catd is not a value returned by catopen() for a message catalog still open in the process.
On page 698, at the end of line 24307, add closing quotes.
Notes




Viewing Issue Advanced Details
1767 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Editorial Clarification Requested 2023-07-18 17:45 2023-08-22 14:46
mohd_akram
 
normal  
Applied  
Accepted  
   
Mohamed Akram
sed
3219
108018
---
sed: clarify behavior of c (change) command
There's some confusion about the behavior of the sed c (change) command. Most implementations interpret it as starting the next cycle on every line that its address range matches. Meaning, any command after it in the script is not processed. Others interpret "start the next cycle" as only occurring at the end of the address range i.e. that commands following the change command are processed except on the last line of the address range.

The implementations that implement the former behavior are:

- GNU sed
- NetBSD sed
- Busybox sed
- Plan 9 sed

Those that don't are:

- FreeBSD sed
- OpenBSD sed

This has been acknowledged as a bug in FreeBSD.

This was previously corrected in NetBSD with the following explanation, referencing the V7 man page:

    sed(1) "c" command is in some sense a shorthand for "i"+"d", so,
    like "d" it should start the next cycle. This is explicitly documented
    in v7 man page:

      Delete the pattern space. With 0 or 1 address or at the end of a
      2-address range, place text on the output. Start the next cycle.

NetBSD commit: https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=45981

FreeBSD bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271817
Change:

Delete the pattern space. With a 0 or 1 address or at the end of a 2-address range, place text on the output and start the next cycle.

To:

Delete the pattern space. With a 0 or 1 address or at the end of a 2-address range, place text on the output. Start the next cycle.
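
A small test case along these lines can be used to tell the two interpretations apart. This is only a sketch: the helper name change_demo and the marker text are invented, and the script assumes nothing beyond the c, s and p commands.

```sh
# change_demo: hypothetical helper. Lines 2-3 are replaced by one line of
# text; the "s" command placed after "c" prints a marker on any line where
# sed keeps executing the script after "c".
change_demo() {
    printf '1\n2\n3\n4\n' | sed -n '2,3c\
CHANGED
2s/^/reached after c:/p'
}

change_demo
```

An implementation that starts the next cycle on every line of the range prints only CHANGED. One that continues the script on line 2 would be expected to print a "reached after c:" line as well.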
Notes




Viewing Issue Advanced Details
1768 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Omission 2023-07-18 23:56 2023-08-17 11:05
larryv
 
normal  
Applied  
Accepted As Marked  
   
Lawrence Velázquez
catgets
648
22290
---
See Note: 0006394.
catgets: incomplete paragraph in "Change History"
Under section "Change History", subsection "Issue 7", the second paragraph ends mid-sentence. In its entirety, it reads:
Austin Group Interpretation 1003.1-2001 #148 is applied, adding
At a bare minimum, on page 648, line 22290, change:
Austin Group Interpretation 1003.1-2001 #148 is applied, adding
to:
Austin Group Interpretation 1003.1-2001 #148 is applied.
This is a complete sentence and paragraph, at least.

Ideally, change:
Austin Group Interpretation 1003.1-2001 #148 is applied, adding
to:
Austin Group Interpretation 1003.1-2001 #148 is applied, adding [whatever was added].
Unfortunately, I don't know anything about this interpretation and thus cannot suggest replacement text.
Notes
(0006394)
geoffclare   
2023-07-20 09:25   
The text of Austin Group Interpretation 1003.1-2001 #148 can be found here:

https://collaboration.opengroup.org/austin/interps/documents/14539/AI-148.txt

It doesn't contain any changes to catgets()!

I wondered whether there might be a typo in the 148 number, but I have been unable to find any other interpretation (besides #044 that is already listed) affecting catgets(), and as far as I can see, all of the changes to catgets() between Issue 6 and Issue 7 are described in the other Change History entries.

The other possibility is that this entry should have been on a different page, but all of the functions affected by #148 have a Change History entry for it.

So I think all we can do at this point is treat this line as spurious and delete it.




Viewing Issue Advanced Details
1769 [Issue 8 drafts] System Interfaces Objection Error 2023-07-24 09:25 2023-08-22 14:48
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
fputwc()
1009
34638
CX shading needed for fputwc() EILSEQ error indicator requirement
As a result of our liaison with the C committee regarding bug 0001022, C17 added a requirement for the stream error indicator to be set on fgetwc() EILSEQ errors. However, it did not do the same for fputwc().

We raised this in a C23 ballot comment and the committee agreed to add the requirement to fputwc() in C23.

Since Issue 8 will reference C17, we need to add CX shading to the requirement; we should also mention the expected C23 change in rationale.
Change:
Otherwise, it shall return WEOF, the error indicator for the stream shall be set, [CX]and errno shall be set to indicate the error[/CX].
to:
Otherwise, it shall return WEOF, [CX]errno shall be set to indicate the error[/CX], and for errors other than [EILSEQ] the error indicator for the stream shall be set; [CX]the error indicator for the stream shall also be set for [EILSEQ] errors.[/CX]

On page 1010 line 34672 section fputwc(), change RATIONALE from "None" to:
The requirement to set the error indicator for the stream on [EILSEQ] errors is CX shaded because the ISO C standard does not require it to be set for fputwc() encoding errors, although it does for fgetwc(). The next revision of the ISO C standard is expected to address this inconsistency by requiring the error indicator for the stream to be set for fputwc() encoding errors.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1770 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Error 2023-07-24 13:52 2023-08-17 11:07
bhaible
 
normal  
Applied  
Accepted  
   
Bruno Haible
GNU
iswblank
1198
40302
---
Typo re iswblank_l
Line 40302 has:
"The iswblank() and iswblank() functions"
What was meant was surely:
"The iswblank() and iswblank_l() functions"
(like in line 40311).
In the CX shaded text, replace iswblank() with iswblank_l().
There are no notes attached to this issue.




Viewing Issue Advanced Details
1771 [Issue 8 drafts] Shell and Utilities Editorial Enhancement Request 2023-08-07 19:22 2023-10-10 09:34
calestyo
 
normal  
Applied  
Accepted As Marked  
   
Christoph Anton Mitterer
Utilities / printf
3269
111019
See Note: 0006470
support or reserve %q as printf-utility format specifier
Hey.

Hope that hasn't been asked for before and rejected, at least I couldn't find anything about it.

I propose considering to either support (in the sense of: conforming implementations must support it) or reserve (in the sense of: *if* a conforming implementation uses that format specifier, *then* it must be for the given purpose) the %q format specifier for the printf-utility as used e.g. by GNU printf or bash.

bash specifies it as:
> %q causes printf to output the corresponding argument in a
> format that can be reused as shell input.

GNU printf as:
> %q ARGUMENT is printed in a format that can be reused as shell
> input, escaping non-printable characters with the proposed POSIX
> $'' syntax.

I'm afraid I don't have access to that many other shells like from BSDs or so, but from what I saw after looking at some manpages, %q seems not to be generally supported, although not used for other purposes either.
(There are experts around here, who can probably tell much better about this.)


I'm also not sure how exactly it should be implemented:

1) One way would be to allow any quoting that is understood by the given shell,... this would for example allow $'...' quotes to be used, even now while they're not (yet) supported by POSIX. And of course any other custom quoting styles used by a shell might then be produced by that.

I guess it's obvious that (1) would be rather bad with respect to interoperability.


2) The better way would IMO be to allow only any quoting styles supported by POSIX.
An open question, to which I have no proper answer, would be whether any preference should be given or perhaps even any obligations should be made (like newline needing to be given as $'\n' or the like, but not literally).


The motivation for supporting %q is that it's IMO generally useful to have an out-of-the-box mechanism to safely quote input so that it may be reused as shell input.
see above
Notes
(0006470)
geoffclare   
2023-09-07 15:43   
(edited on: 2023-10-05 15:08)
After page 3275 line 111227 section printf, add two paragraphs to FUTURE DIRECTIONS:
A future version of this standard is expected to add a %b conversion to the printf() function for binary conversion of integers, in alignment with the next version of the ISO C standard. This will result in an inconsistency between the printf utility and printf() function for format strings containing %b. Implementors are encouraged to collaborate on a way to address this which could then be adopted in a future version of this standard. For example, the printf utility could add a -C option to make the format string behave in the same way, as far as possible, as the printf() function.

A future version of this standard may add a %q conversion to convert a string argument to a quoted output format that can be reused as shell input.
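
Until such a conversion exists, quoting for reuse as shell input can be approximated with features already in the standard. The following is a sketch only, using the common single-quote technique. The function name sh_quote is an assumption, and the output style (single quotes throughout, newlines left literal) is just one of the possibilities discussed in the description.

```sh
# sh_quote: hypothetical helper; writes its argument wrapped in single
# quotes, with each embedded single quote rewritten as '\'' so that the
# result can be reused as shell input.
sh_quote() {
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# Round trip: the quoted form evaluates back to the original string.
arg="it's a \$HOME * test"
eval "copy=$(sh_quote "$arg")"
if [ "$copy" = "$arg" ]; then
    echo "round trip ok"
fi
```

Because sed works line by line, embedded newlines pass through unchanged inside the quotes; whether a %q conversion should instead emit them in some escaped form is the open question raised above.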






Viewing Issue Advanced Details
1772 [1003.1(2016/18)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2023-08-21 09:58 2023-09-04 10:37
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
make
2971
98562
---
See Note: 0006434.
make's ASYNCHRONOUS EVENTS has several problems
The ASYNCHRONOUS EVENTS section for the make utility says:
If not already ignored, make shall trap SIGHUP, SIGTERM, SIGINT, and SIGQUIT and remove the current target unless the target is a directory or the target is a prerequisite of the special target .PRECIOUS or unless one of the -n, -p, or -q options was specified. Any targets removed in this manner shall be reported in diagnostic messages of unspecified format, written to standard error. After this cleanup process, if any, make shall take the standard action for all other signals.
There are several problems with this...

1. It says how SIGHUP, SIGTERM, SIGINT, and SIGQUIT are handled when several conditions are all met, but says nothing about how they are handled when any of those conditions is not met.

2. It assumes there is always a "current target". This is not the case if, for example, a signal is received while "make -f -" is reading a makefile supplied on standard input.

3. It requires a "diagnostic message" to be written to standard error, which means (as per 1.4 Utility Description Defaults) that the exit status must indicate that an error occurred, but some implementations instead set the signal to default and re-signal the process to terminate it.

4. It is not clear how the phrase "After this cleanup process, if any" is supposed to be understood. Since "this" refers back to a previously described cleanup process of removing the current target, is "if any" supposed to be making this removal optional for signals other than SIGHUP, SIGTERM, SIGINT, and SIGQUIT? If the removal is allowed it would have to be without the diagnostic message, which seems unlikely to be what was intended. And if removal is allowed for other signals, that includes signals that do not terminate the process (such as SIGTSTP and SIGCONT), which obviously would not have been intended. None of the implementations I tried removed the target for other signals. (I tried Solaris, GNU, and Debian's BSD-derived bmake, but not a "real" BSD make.)

Finally, GNU make doesn't do the specified target removal for SIGHUP, SIGTERM, SIGINT, and SIGQUIT. This could either be treated as a non-conformance in GNU make that they should put right (OPTION 1), or we could allow either behaviour in the standard (OPTION 2).
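The "set the signal to default and re-signal" behaviour mentioned in point 3 can be sketched in shell. This is only an illustration of the pattern, not how any make implementation is written; resignal_demo and on_term are hypothetical names, and the child sh stands in for a make-like tool that has started writing a target.

```shell
# Hypothetical stand-in for a make-like tool; resignal_demo and on_term are
# illustrative names only.
resignal_demo() {
    sh -c '
        target=$1
        on_term() {
            rm -f -- "$target"            # remove the partially built target
            echo "removing $target" >&2   # informational message on stderr
            trap - TERM                   # set the signal back to its default action
            kill -s TERM "$$"             # re-signal ourselves so we die from SIGTERM
        }
        trap on_term TERM
        : > "$target"                     # pretend the recipe has started writing
        kill -s TERM "$$"                 # simulate SIGTERM arriving mid-build
        sleep 1
        echo "not reached"
    ' sh "$1"
}
```

Called as resignal_demo "$(mktemp)", the child should terminate from the signal, so the caller sees a wait status reflecting SIGTERM rather than an ordinary error exit, and the target file is gone.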
On page 2339 line 74433 section 1.4, change:
Default Behavior: When this section is listed as ``Default.'', or it refers to ``the standard action for all other signals; see Section 1.4 (on page 2336)'' it means ...
to:
Default Behavior: When this section is listed as ``Default.'', or it refers to ``the standard action'' for any signal, it means ...

On page 2971 line 98562 section make, after applying bug 523 replace the text of the ASYNCHRONOUS EVENTS section with:
OPTION 1

For SIGHUP, SIGINT, SIGQUIT, and SIGTERM signals, if the signal was not inherited as ignored, none of the -n, -p, or -q options was specified, make is currently processing a target or inference rule, the current target is not a directory, and the target is not a prerequisite of the special targets .PHONY or .PRECIOUS:
  • make shall catch the signal and remove the current target; any targets removed in this manner shall be reported in diagnostic or informational messages of unspecified format, written to standard error.

  • If make writes a diagnostic message to standard error, it shall exit with a status that indicates an error occurred; otherwise, it shall set the signal to default and re-signal itself.
In all other circumstances, make shall take the standard action for all signals; see [xref to 1.4].

OPTION 2

As above but with "may catch the signal" instead of "shall catch the signal"

On page 2972 line 98577 section make, change STDERR from:
The standard error shall be used only for diagnostic messages.
to:
The standard error shall be used for diagnostic messages and may be used for informational messages about target removals (see ASYNCHRONOUS EVENTS).

Notes
(0006434)
Don Cragun   
2023-08-21 16:27   
Make the following changes:
On page 2339 line 74433 section 1.4, change:
Default Behavior: When this section is listed as ``Default.'', or it refers to ``the standard action for all other signals; see Section 1.4 (on page 2336)'' it means ...
to:
Default Behavior: When this section is listed as ``Default.'', or it refers to ``the standard action'' for any signal, it means ...

On page 2971 line 98562 section make, after applying bug 523 replace the text of the ASYNCHRONOUS EVENTS section with:
For SIGHUP, SIGINT, SIGQUIT, and SIGTERM signals, if the signal was not inherited as ignored, none of the -n, -p, or -q options was specified, make is currently processing a target or inference rule, and the current target is neither a directory nor a prerequisite of the special targets .PHONY or .PRECIOUS:
  • make shall catch the signal and, if the time of last data modification of the current target has changed since make began processing the rule to bring that target up to date, remove that target; it may also remove that target if the time of last data modification has not changed. Any targets removed in this manner shall be reported in diagnostic or informational messages of unspecified format, written to standard error.
  • If make writes a diagnostic message to standard error, it shall exit with a status that indicates an error occurred; otherwise, it shall set the signal to default and re-signal itself.
In all other circumstances, make shall take the standard action for all signals; see [xref to 1.4].



On page 2972 line 98577 section make, change STDERR from:
The standard error shall be used only for diagnostic messages.
to:
The standard error shall be used for diagnostic messages and may be used for informational messages about target removals (see ASYNCHRONOUS EVENTS).




Viewing Issue Advanced Details
1773 [Issue 8 drafts] Base Definitions and Headers Editorial Error 2023-08-24 13:30 2023-09-05 11:21
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
1.1,1.4,1.6,<fcntl.h>,...
4,5,6,250,...
multiple
Address the various Notes to Reviewers
Draft 4 seems like the right time to deal with the various Notes to Reviewers.

The suggested changes in the desired action are mostly self-explanatory, but the situation for O_NOCLOBBER is worth mentioning: as far as I can tell nobody has implemented O_NOCLOBBER yet, so the notes suggesting "If any implementations have added support for O_NOCLOBBER before Issue 8 is finalized, this could be changed" should all just be deleted.

Also, there was one formal IEEE Interpretation Request against IEEE Std 1003.1-2008 (which was captured in Mantis as bug 240). However, I was unable to find any ISO/IEC defect reports against ISO/IEC 9945:2009; it would be good if the ISO/IEC OR could confirm this.
On page 4 line 44 section 1.1, change:
Issues raised by Austin Group defect reports, IEEE Interpretations against IEEE Std 1003.1, and ISO/IEC defect reports against ISO/IEC 9945
to:
Issues raised by Austin Group defect reports and IEEE Interpretations against IEEE Std 1003.1.

On page 4 line 45 section 1.1, delete the note that says:
Were there any IEEE Interpretations against IEEE Std 1003.1 or ISO/IEC defect reports against ISO/IEC 9945?

On page 5 line 78 section 1.4, delete the note that says:
This list will be updated in a later draft.

On page 5 line 85 section 1.4, change:
ISO 4217:2001
ISO 4217:2001, Codes for the Representation of Currencies and Funds.
to:
ISO 4217:2015
ISO 4217:2015, Codes for the representation of currencies.

Note to the editor: also change the Z& troff string to match.

On page 5 line 87 section 1.4, change:
ISO 8601:2004
ISO 8601:2004, Data Elements and Interchange Formats -- Information Interchange -- Representation of Dates and Times.
to:
ISO 8601-1:2019
ISO 8601-1:2019, Date and time -- Representations for information interchange -- Part 1: Basic rules

Note to the editor: also change the Z0 troff string to match.

On page 5 line 90 section 1.4, change:
ISO C (1999)
ISO/IEC 9899:1999, Programming Languages -- C, including ISO/IEC 9899:1999/Cor.1:2001(E), ISO/IEC 9899:1999/Cor.2:2004(E), and ISO/IEC 9899:1999/Cor.3.
to:
ISO C (C17)
ISO/IEC 9899:2018, Programming Languages -- C

On page 5 line 87 section 1.4, change:
ISO/IEC 10646-1:2000
ISO/IEC 10646-1:2000, Information Technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane.
to:
ISO/IEC 10646:2020
Information technology -- Universal coded character set (UCS)

Note to the editor: also change the ZN troff string to match.

On page 6 line 102 section 1.6, this note:
This section has some overlap with the IEEE-mandated ``Word Usage'' section above. There does not appear to be any conflict, but here we distinguish between implementations and applications, and so we cannot simply refer to ``Word Usage'' for the words covered there. Also, according to XRAT, the meanings specified here for the words shall, should, and may are mandated by ISO/IEC directives.
is information that could still be useful for reviewers of draft 4, and
so should remain for now.

On page 250 line 8776 section <fcntl.h>, and
page 1514 line 50795 section open(), delete the note that says:
The following represents Option 1 of Mantis bug 1016. If any implementations have added support for O_NOCLOBBER before Issue 8 is finalized, this could be changed to Option 2.

On page 491 line 17286 section 1.1, delete the note that says:
This list will be updated in a later draft (at the same time as XBD ``Normative References'').

On page 491 line 17290 section 1.1, change:
ISO C (1999)
ISO/IEC 9899:1999, Programming Languages -- C, including ISO/IEC 9899:1999/Cor.1:2001(E), ISO/IEC 9899:1999/Cor.2:2004(E), and ISO/IEC 9899:1999/Cor.3.
to:
ISO C (C17)
ISO/IEC 9899:2018, Programming Languages -- C

On page 491 line 17294 section 1.1, change:
9899:1999
to:
9899:2018

On page 2478 line 80565 section 2.7.2, delete the note that says:
The following two paragraphs represent Option 1 of Mantis bug 1016. If any implementations have added support for O_NOCLOBBER before Issue 8 is finalized, this could be changed to Option 2. If Option 2 is adopted in a future draft, note that the change from https://www.austingroupbugs.net/view.php?id=1364 [^] will need to be reapplied to the Option 2 text.

On page 3695 line 127028 section B.1.1, and
page 3819 line 132371 section C.1.1, change:
Austin Group defect reports, IEEE Interpretations against IEEE Std 1003.1, and responses to ISO/IEC defect reports against ISO/IEC 9945 are applied.
to:
Austin Group defect reports and IEEE Interpretations against IEEE Std 1003.1 are applied.

On page 3695 line 127030 section B.1.1, and
page 3819 line 132373 section C.1.1, delete the note that says:
Were there any IEEE Interpretations against IEEE Std 1003.1 or ISO/IEC defect reports against ISO/IEC 9945?

On page 3855 line 133916 section C.2.7.2, delete the note that says:
The following represents Option 1 of Mantis bug 1016. If any implementations have added support for O_NOCLOBBER before Issue 8 is finalized, this could be changed to Option 2. If Option 2 is adopted in a future draft, note that the change from https://www.austingroupbugs.net/view.php?id=1364 [^] will need to be reapplied to the Option 2 text.

Notes




Viewing Issue Advanced Details
1775 [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers Editorial Enhancement Request 2023-09-13 19:56 2023-12-07 14:08
enh
 
normal  
Applied  
Accepted  
   
Elliott Hughes
Google
<signal.h> siginfo_t si_addr description
336
11382
---
si_addr description confusingly over-specific
the description of siginfo_t::si_addr is currently:
```
void *si_addr Address of faulting instruction.
```
which is over-specific to SIGILL.
something like this seems like it would be more accurate/less confusing:
```
void *si_addr Address that caused fault.
```
There are no notes attached to this issue.




Viewing Issue Advanced Details
1776 [Issue 8 drafts] Shell and Utilities Objection Error 2023-09-29 15:07 2023-12-07 14:10
stephane
 
normal  
Applied  
Accepted As Marked  
   
Stéphane Chazelas
find utility
2920, 2924
97596, 97747
Note: 0006578
find -newer any symlinks
The find specification has:

> -newer file
>
> The primary shall evaluate as true if the modification time of the current file is
> more recent than the modification time of the file named by the pathname file.

First maybe clarify that if relative, it's relative to the working directory at the time find was started.

That text doesn't say whether the modification time of "file" should be obtained before (lstat()) or after (stat()) symlink resolution, nor what should happen if that cannot be obtained, nor the effect of -L and -H in that regard.

What I find is that most implementations I've tried do a stat() on the file while GNU find does a lstat() unless -L/-H is provided (and the behaviour is clearly documented there).

All the ones I tried fail immediately with an error if the modification time of the file cannot be obtained and those that use stat() don't attempt a lstat() if stat() fails.

In implementations that do stat() it can happen that both:

find a -prune -newer b
find b -prune -newer a

print or don't print as find does a lstat() on the path operand and stat() on the reference file. For example, if "a" is a new symlink to an old file, "a" would be newer than "b" because find compares the age of the symlink with the age of "b" while "b" would also be newer than "a" because in that direction we compare the age of b with that of the old file "a" links to.

With -L/-H, the behaviour is consistent across implementations as stat() is performed on all files.
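A minimal sketch of the asymmetry described above, assuming mktemp is available and that a freshly created symbolic link carries the current time as its own modification time. The names old, a, b, and newer_demo are illustrative only. Without -H or -L, the "plain b/a" result is the one that depends on whether the implementation uses stat() or lstat() on the reference file; the other three lines should be the same everywhere.

```shell
# newer_demo: build an old file, a newer file b, and a brand-new symlink a -> old,
# then compare them with find -newer in both directions.
newer_demo() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch -t 200001010000 old   # oldest file (year 2000)
        touch -t 201001010000 b     # newer than old (year 2010)
        ln -s old a                 # the link itself is created now
        printf 'plain a/b: [%s]\n'   "$(find a -prune -newer b)"
        printf 'plain b/a: [%s]\n'   "$(find b -prune -newer a)"   # varies by implementation
        printf 'with -H a/b: [%s]\n' "$(find -H a -prune -newer b)"
        printf 'with -H b/a: [%s]\n' "$(find -H b -prune -newer a)"
    )
    rm -rf -- "$dir"
}
newer_demo
```

If both "plain" lines print a name, the implementation compares the link's own time for the path operand but the referenced file's time for the -newer operand.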

In any case, the example at line 97747 is incorrect without -H/-L as test's -nt compares files after symlink resolution (also note that the behaviour varies across test implementations when operand cannot be stat()ed).

IMO, GNU find's behaviour is better for the symmetry consideration and as it allows one to compare the age of symlinks. Having the possibility to choose between stat() and lstat() is also preferable for -samefile and -newerXY which POSIX might consider adding in the future.
- clarify how pathname resolution is to be performed in case path is a relative path
- either specify the GNU find behaviour or leave it unspecified whether the mtime of the file is obtained before or after symlink resolution if -H/-L are not provided.
- clarify that it shall be an error if the mtime of the file cannot be obtained
- fix the example at line 97747 to use -H, and maybe add a "provided both file1 and file2 are accessible".
Notes
(0006578)
geoffclare   
2023-11-20 17:30   
On page 2920 line 97596 section find (-newer), change:
The primary shall evaluate as true if the modification time of the current file is more recent than the modification time of the file named by the pathname file.
to:
The primary shall evaluate as true if the modification time of the current file is more recent than the modification time of the file named by the pathname file. If file names a symbolic link, the modification time used shall be that of the file referenced by the symbolic link if either the -H or -L option is specified; if neither -H nor -L is specified, it is unspecified whether the modification time is that of the symbolic link itself or of the file referenced by the symbolic link. In either case, if the referenced file does not exist, the modification time used shall be that of the link itself. If file is a relative pathname, it shall be resolved relative to the current working directory that was inherited by find when it was invoked.




Viewing Issue Advanced Details
1777 [Issue 8 drafts] Shell and Utilities Objection Error 2023-09-29 15:32 2023-12-07 14:12
stephane
 
normal  
Applied  
Accepted As Marked  
   
Stephane Chazelas
find utility EXAMPLES
2923
97767 and below
See Note: 0006591.
inaccuracy in find example 10
On example 10: -size +199999 yields true for files whose size, rounded up to an integer number of 512-byte units, is strictly greater than 199999; that is, files of 199999*512+1 bytes or larger, not "100000 KiB or larger".
Either change the code to -size +200000 and the text to "for files of size larger than 100000KiB"

or

keep the text and change the code to -size +102399999c or to '(' -size 200000 -o -size +200000 ')' or ! -size -200000
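The rounding arithmetic behind this report can be checked with shell arithmetic. The helper names size_units and matches_size_plus are hypothetical; they only model the "round up to 512-byte units, then compare" rule described above and do not call find.

```shell
# Number of 512-byte units used for -size n comparisons (size rounded up).
size_units() {
    echo $(( ($1 + 511) / 512 ))
}

# Succeeds if a file of $1 bytes would satisfy "-size +$2" under that rule.
matches_size_plus() {
    [ "$(size_units "$1")" -gt "$2" ]
}

size_units 102399489   # 199999*512+1 bytes: smallest size matching -size +199999
size_units 102400000   # 100000 KiB expressed in bytes
```

Under this model, a file of exactly 100000 KiB occupies 200000 units, so it matches -size +199999 but not -size +200000.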
Notes
(0006591)
Don Cragun   
2023-11-27 16:13   
(edited on: 2023-11-27 16:27)
Change on P2924 line 97768 (find EXAMPLES):
searches the file hierarchy for files of size 100 000 KiB or larger
to:
searches the file hierarchy for files of size larger than 100 000 KiB

    
Change on P2924, L97770 (find EXAMPLES):
<pr>find / -path /media -prune -o -size +199999 -print</pr>
to:
<pr>find / -path /media -prune -o -size +200000 -print</pr>






Viewing Issue Advanced Details
1778 [Issue 8 drafts] Shell and Utilities Objection Enhancement Request 2023-10-02 13:58 2023-12-07 14:20
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 3/read
3291-3294
111869-111878, 111961-111963, 11946, 11979-11980
See Note: 0006592
The read utility needs field splitting updates/corrections (and a little more)
The description of how field splitting is used to process the input
and allocate it to the named variables has similar problems as those
with field splitting itself. The latter is being fixed in bug:1649.

The description of how this is done with read needs similar updates
(though less extensive) - whether or not bug:1649 is finally accepted.
 
In addition, starting on line 111961 the description of read says:
 
   (If IFS is not set to the null string this applies even when using -d "",
   because the field splitting performed by read is a character-based
   operation.)

which I am not sure remains true after the effects of bug:1560 (as
reinterpreted perhaps in bug:1649) are applied, but I will leave it
to others more knowledgeable about what is or was intended here, and
whether or not it still applies, to decide if anything needs to be
done with that sentence.
 
Also the RATIONALE says (starting line 111979)

    Since read affects the current shell execution environment,
    it is generally provided as a shell regular built-in.
   
which doesn't make a lot of sense to say to me, given that line 11946 says:

    This utility is required to be intrinsic. See Section 1.7 ...

but again I will leave it for others to decide what to do about that, and
the immediately following text, as the RATIONALE isn't exactly important.
For the parts I can suggest a solution, see a note below (I prefer to do it
that way rather than include the text here, as that way I can edit it if it
turns out to look absurd - which it very well may do).
Notes
(0006592)
nick   
2023-11-27 17:29   
On page 3291 line 111859 section read, change:
    
By default, unless the -r option is specified, <backslash> shall act as an escape character. An unescaped <backslash> shall preserve the literal value of the following character, with the exception of ...

to:
    
If the -r option is not specified, <backslash> shall act as an escape character. An unescaped <backslash> shall preserve the literal value of a following <backslash> and shall prevent a following byte (if any) from being used to split fields, with the exception of ...
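A small sketch of the effect of an unescaped <backslash> on splitting, with and without -r. The function names escape_demo and raw_demo are illustrative; the input is passed through printf '%s\n' so the backslash reaches read unchanged.

```shell
# Without -r: the backslash stops the following space from splitting fields.
escape_demo() {
    printf '%s\n' "$1" | { read x y; printf '[%s][%s]\n' "$x" "$y"; }
}

# With -r: the backslash is an ordinary byte and the space splits as usual.
raw_demo() {
    printf '%s\n' "$1" | { read -r x y; printf '[%s][%s]\n' "$x" "$y"; }
}

escape_demo 'a\ b c'   # [a b][c]
raw_demo 'a\ b c'      # [a\][b c]
```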

On page 3291 line 111869 section read, replace the paragraph beginning
  
The terminating logical line delimiter

and continuing down past the three bullet points to line 111878 (inclusive) with the following:
    
The terminating logical line delimiter (if any) shall be removed from the input. Then if the shell variable IFS (see [xref XCU 2.5.3]) is set, and its value is an empty string, the resulting data shall be assigned to the variable named by the first var operand, and the variables named by other var operands (if any) shall be set to the empty string. No other processing shall be performed in this case.

If IFS is unset, or is set to any non-empty value, then a modified version of the field splitting algorithm specified in [xref XCU 2.6.5] shall be applied, with the modifications as follows:
    

  1. The input to the algorithm shall be the logical line (minus terminating delimiter) that was read from standard input, and shall be considered as a single initial field, all of which resulted from expansions, with any escaped byte and the preceding <backslash> escape character treated as if they were the result of a quoted expansion, and all other bytes treated as if they were the results of unquoted expansions.

  2. The loop over the contents of that initial field shall cease when either the input is empty or n output fields have been generated, where n is one less than the number of var operands passed to the read utility. Any remaining input in the original field being processed shall be returned to the read utility ``unsplit''; that is, unmodified except that any leading or trailing IFS white space, as defined in [xref to XCU 2.6.5], shall be removed.

The specified var operands shall be processed in the order they appear on the command line, and the output fields generated by the field splitting algorithm shall be used in the order they were generated, by repeating the following checks until neither is true:

  • If more than one var operand is yet to be processed and one or more output fields are yet to be used, the variable named by the first unprocessed var operand shall be assigned the value of the first unused output field.

  • If exactly one var operand is yet to be processed and there was some remaining unsplit input returned from the modified field splitting algorithm, the variable named by the unprocessed var operand shall be assigned the unsplit input.

If there are still one or more unprocessed var operands, each of the variables named by those operands shall be assigned an empty string.

Note that in the case where just one var operand is given on the read command line, the modified field splitting algorithm ceases after producing zero output fields and simply returns the original input field, with any leading and trailing IFS white space removed, as unsplit input. This unsplit input is assigned to the variable named by the var operand.
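The assignment rules above can be tried out with a few short helpers. This is a sketch assuming the default IFS (space, tab, newline); split3, split1, and raw1 are hypothetical names, and the brackets in the output only make leading and trailing white space visible.

```shell
# Three var operands: two split fields, then the unsplit remainder.
split3() {
    printf '%s\n' "$1" | { read -r a b c; printf '[%s][%s][%s]\n' "$a" "$b" "$c"; }
}

# One var operand: no splitting, only leading/trailing IFS white space removed.
split1() {
    printf '%s\n' "$1" | { read -r line; printf '[%s]\n' "$line"; }
}

# IFS set to the empty string: the data is assigned untouched.
raw1() {
    printf '%s\n' "$1" | { IFS= read -r line; printf '[%s]\n' "$line"; }
}

split3 '  one two three  four  '   # [one][two][three  four]
split3 'only'                      # [only][][]
split1 '  hello world  '           # [hello world]
raw1   '  hello world  '           # [  hello world  ]
```

The second call shows the final rule: operands left over once the input is exhausted are assigned the empty string.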


On page 3292 line 111900 section read (OPERANDS), after:
    
The name of an existing or nonexisting shell variable.

append:
    
If a var operand names the variable IFS, the behavior is unspecified.
    
    If a var operand names one of the variables LANG, LC_CTYPE, or LC_ALL and the new value assigned to the variable would change how the bytes in IFS form characters, or which characters in IFS are considered to be IFS white space (see [xref XCU 2.6.5]), it is unspecified what effects, if any, the change has on how read performs field splitting.

    
On page 3292 line 111902 section read (STDIN), change:
    
If the -d delim option is not specified, or if it is specified and delim consists of one single-byte character, the standard input shall contain zero or more characters and shall not contain any null bytes.

to:
    
If the -d delim option is not specified, or if it is specified and delim is not the null string, the standard input shall contain zero or more bytes (which need not form valid characters) and shall not contain any null bytes.

On page 3293 line 111959 section read (APPLICATION USAGE), change:
    
When the current locale is not the C or POSIX locale, pathnames can contain bytes that do not form part of a valid character, and therefore portable applications need to ensure that the current locale is the C or POSIX locale when using read with arbitrary pathnames as input. (If IFS is not set to the null string this applies even when using -d "", because the field splitting performed by read is a character-based operation.) When reading a pathname it is also inadvisable ...

to:
    
When reading a pathname it is inadvisable ...

After page 3293 line 111967 section read (APPLICATION USAGE), add:
    
Since the var operands are processed in the order specified on the command line, if any variable name is specified more than once as a var operand, the last assignment made is the one that is in effect when read returns, including when an empty string is assigned because no field data was available.
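For illustration, a sketch of what the added text implies when the same name is given twice. dup_demo is a hypothetical helper, and the results shown are those expected from a shell that assigns the operands in order as described.

```shell
# The same variable named twice: the later assignment wins.
dup_demo() {
    printf '%s\n' "$1" | { a=initial; read -r a a; printf '[%s]\n' "$a"; }
}

dup_demo '1 2'   # [2]  second operand receives the remaining input
dup_demo '1'     # []   second operand receives the empty string
```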


On page 3293, lines 111979-111980, change:
    
Since read affects the current shell execution environment, it is generally provided as a shell regular built-in.

to:
    
Since read affects the current shell execution environment, it is required to be intrinsic.




Viewing Issue Advanced Details
1779 [Issue 8 drafts] Shell and Utilities Objection Omission 2023-10-16 22:07 2023-12-07 14:22
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 3/read
3291
111880-111882
See Note: 0006593.
Standard does not say how much read should do if a read-only variable is given
This issue is an offshoot from 0001788 as requested in Note: 0006534

As Note: 0006534 says, the standard currently says (lines 111880-2 of I.8 D.3)

   An error in setting any variable (such as if a var has previously been
   marked readonly) shall be considered an error of read processing, and shall
   result in a return value greater than one.

Note: 0006534 goes on to say:

   It is silent about when read exits (immediately or after processing later
   operands) so both behaviours are allowed.

It is worse than that: there are 5 behaviours I can imagine which would be
allowed by silence (unspecified because no-one thought to specify it)

   . The vars are checked to see if any is read only before read does
       any processing, and read does exit(2) before it even attempts to
       read anything if a read only var is detected. No vars altered.
   . the same check is done but after the input has been read, but before
       field splitting, no vars altered.
   . as each var is assigned a value, if it is read only (or the assignment
       fails for any other reason, such as ENOMEM to a malloc() request) then
       all processing stops, that var, and any following it are unchanged,
       previous vars have been assigned values already, and read does exit(2).
   . as each var is assigned a value, if it is read only (or the assignment
       fails for any other reason) the variable in question is left unaltered,
       but processing continues and other vars (if possible) are assigned
       values as if the one that failed had also been assigned a value. When
       complete, read does exit(2)
   . as each var is assigned a value, if it is read only (or the assignment
       fails for any other reason) the variable in question is left unaltered,
       all remaining vars are set to the null string (if assignments to them
       are possible) and read does exit(2)

As best I can tell only the 3rd and 4th of those actually exist in shells.
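A quick way to see which of these behaviours a given shell has is a probe along the following lines. probe_readonly is a hypothetical name; the exit status and the final value of c are expected to differ between shells, which is the point of the report, while a should receive its value and b should stay untouched under either the 3rd or the 4th behaviour.

```shell
# Probe: the middle var operand is readonly; report status and resulting values.
probe_readonly() {
    (
        a=old_a
        c=old_c
        b=locked
        readonly b
        printf '1 2 3\n' | {
            read -r a b c
            printf 'status=%s a=%s b=%s c=%s\n' "$?" "$a" "$b" "$c"
        }
    ) 2>/dev/null   # hide the shell's own diagnostic about b
}

probe_readonly
```

A result with c=old_c indicates that processing stopped at the failing operand; c=3 indicates that it continued past it.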

I have a general hatred of "implicitly unspecified" as this currently is,
as that can lead to more varying implementation techniques (there may be
more possible than I considered above ... perhaps simply setting all vars
which can be set to the empty string if an assignment to any of them fails,
kind of a combination of number 1, or 2, with number 5) when ideally we'd
like to converge upon one behaviour that scripts could rely upon.
Append after the sentence quoted in the description from lines 111880...
of I.8 D.3 the following new sentence (Before "If it is called in a subshell...")

Variables named before the one generating the error shall be
set as described above, it is unspecified whether variables named later
shall be set as above, or whether read simply ceases processing when
the error occurs, leaving later named variables unaltered

Notes
(0006593)
Don Cragun   
2023-11-30 16:28   
After bug 0001778 has been applied...
Before P3291 L111882:
If it is called in a subshell...

add a new sentence:
Variables named before the one generating the error shall be
set as described above; it is unspecified whether variables named later
shall be set as above, or read simply ceases processing when
the error occurs, leaving later named variables unaltered.




Viewing Issue Advanced Details
1780 [1003.1(2016/18)/Issue7+TC2] Rationale Editorial Clarification Requested 2023-10-17 02:34 2023-11-14 10:39
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
<individual>
c181.pdf
B.2.8 Realtime >> Message Passing
3590
122093-122095
---
Note: 0006562
The punctuation is confusing/ambiguous
Towards the end of the rationale for "Prioritization of Messages", this is said:

> The prioritization does add additional overhead to the message operations in those cases it is actually used but a clever implementation can optimize for the FIFO case to make that more efficient

It is confusing when reading towards the middle:

"... overhead to the message operations in those cases it is actually ..."

I suspect a comma is missing between "operations" and "in those cases".
Clarify this paragraph.
Notes
(0006562)
geoffclare   
2023-11-02 15:10   
Change:
The prioritization does add additional overhead to the message operations in those cases it is actually used but a clever implementation can optimize for the FIFO case to make that more efficient.
to:
The prioritization does add additional overhead to the message operations, in those cases it is actually used, but a clever implementation can optimize for the FIFO case to make that more efficient.




Viewing Issue Advanced Details
1781 [1003.1(2016/18)/Issue7+TC2] Rationale Editorial Clarification Requested 2023-10-17 08:53 2023-11-14 10:41
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
<individual>
C181.pdf
B.2.8 Realtime >> Rationale for the Monotonic Clock
3622
123529-123533
---
Note: 0006563
Factual error due to mis-wording with regard to timed wait functions?
At the aforementioned line range, this is said:

> It was decided that the features of CLOCK_MONOTONIC are not as critical to these functions as they are to pthread_cond_timedwait(). The pthread_cond_timedwait() function is given a relative timeout; the timeout may represent a deadline for an event. When these functions are given relative timeouts, the timeouts are typically for error recovery purposes and need not be so precise.

The `pthread_cond_timedwait()` function is specified with absolute time-outs, and the later sentence doesn't make sense with the earlier parts of the paragraph.

It probably should say:

> the *other* function is given an absolute timeout; the timeout may represent a deadline for an event. When other functions are given relative timeouts ... typically for error recovery ...
Clarify the intended wording.

BTW, my previous reported issue should probably be moved to Issue7+TC2 as well, I forgot to select the correct project.
Notes
(0006563)
geoffclare   
2023-11-02 15:18   
Change:
The pthread_cond_timedwait() function is given a relative timeout; the timeout may represent a deadline for an event. When these functions are given relative timeouts, the timeouts are typically for error recovery purposes and need not be so precise.
to:
The pthread_cond_timedwait() function is given an absolute timeout; the timeout may represent a deadline for an event. When other functions are given relative timeouts, the timeouts are typically for error recovery purposes and need not be so precise.




Viewing Issue Advanced Details
1782 [1003.1(2016/18)/Issue7+TC2] Rationale Objection Enhancement Request 2023-10-17 11:30 2023-11-14 10:44
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
<individual>
c181.pdf
B.2.8 Realtime >> Rationale Relating to Timeouts
3628
123791
---
Note: 0006564
Standardize "pthread_yield"? Or change the example code?
In the example code at page 3628, the example used the `pthread_yield()` function.

This function is a commonplace extension in most Pthread implementations; however, it was not included in the normative text of the current standard. Instead, `sched_yield()` is specified and included in the base.

C11 has the `thrd_yield()` function, which is part of the "threads.h" header.
Standardize `pthread_yield()` (as C11 has standardized `thrd_yield()`)
OR
replace the call in the code example with `sched_yield()`
OR
some other preferable solution?
Notes
(0006564)
geoffclare   
2023-11-02 15:31   
Change:
pthread_yield();
to:
sched_yield();

The editor may also change the brace indentation to match other parts of the standard. (lines 123780-123794 and lines 123804-123847).




Viewing Issue Advanced Details
1783 [1003.1(2016/18)/Issue7+TC2] Rationale Editorial Error 2023-10-18 04:21 2023-11-14 10:45
dannyniu
 
normal  
Applied  
Accepted  
   
DannyNiu/NJF
<individual>
c181.pdf
B2.9.1 Thread-Safety
3647
124665-124666
---
This sentence from this section is discussing thread safety isn't it?
At the aforementioned lines, this is said:

> As it turns out, some functions are inherently not thread-safe; that is, their interface specifications preclude async-signal-safety.

Notice towards the end of this sentence, it says "async-signal-safety". However, this section is dealing with "thread safety", so I think this is probably a typo.
Change "async-signal-safety" to "thread-safety" if that's what's actually intended.
Notes
(0006565)
Don Cragun   
2023-11-02 15:41   
The Suggested Fix is correct; that is what was intended.




Viewing Issue Advanced Details
1784 [Issue 8 drafts] Shell and Utilities Objection Error 2023-10-22 06:14 2023-12-18 10:43
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
XCU 3 / getopts
2955 - 2959
98803 - 98966
See Note: 0006600.
getopts specification needs fixing (multiple issues)
First:

Line 98807
        and the index of the next argument to be processed in the
        shell variable OPTIND.

Much the same is in the ENVIRONMENT VARIABLES section, lines 98888-9
say:
        OPTIND This variable shall be used by the getopts utility as
                the index of the next argument to be processed.

Which is the "next argument to be processed" - the argument after the
one that supplied the option written into the name arg, or the
argument that will be processed by the next call to getopts ?

It makes a difference when the argument in question has two (or more)
options in it, and anything but the last of them is being processed now.

Eg: (given an optstring with "xy" in it (no colons))
        script -xy -d
if getopts is used in script to process those options, then
where name is set to 'x', this same arg will be processed again
next time to return 'y', but the "next argument" is the one
containing -d in many people's interpretation (and different shells
interpret it each way, in some OPTIND is 1 for 'x' and '2' for 'y',
in others it is 2 for both 'x' and 'y'). yash is different: its
(intermediate) OPTIND settings contain the index of the arg being
processed, a colon, and the index of the option char within that arg
(so would be 1:2 and 1:3 in this case).

The standard is unclear what is intended here, it would be better to
simply say that the value of OPTIND at this point is unspecified, as
in practice there isn't anything much a script can do with it anyway,
even if we did pick one of the plausible interpretations. Pretending
that a simple integer is useful to the implementation (which the
definition at line 98888 does) is not helpful to anyone - to keep
track of what it is up to, the implementation either needs to use
some other mechanism (ie: not use OPTIND for anything except when
the application does OPTIND=1) or it needs (as yash does) to encode
more than just an integer into OPTIND.

Beyond that, is the term "index of" defined anywhere? (It isn't in XBD 3)
If it is, there should be an xref, otherwise there should be a definition
given here. What is its format? For the usage when getopts returns
an exit status of 1, it is clearly intended to contain an integer, as
the EXAMPLES section shows at line 98951

        shift $(($OPTIND - 1))

which wouldn't work if OPTIND were not an integer. But is that
also actually required of the OPTIND returned upon other invocations?

If the intent here was to rely upon the standard English use of
the term, then that fails, as there really isn't one of those, to
be useful an index has to be relative to some base, is the
first option index 0 or index 1 (or something else) ?


On line 98836 it is stated:
        The shell variables OPTIND and OPTARG shall be local to the
        caller of getopts

WTF? What is that supposed to mean, that is, what does it mean
to be local to something, and what exactly is the "caller of getopts" ??
Really!

This is particularly absurd, as in the immediately following paragraph
(lines 98840-1) it says:

        The shell variable specified by the name operand, OPTIND, and OPTARG
        shall affect the current shell execution environment;

which makes sense, and is what implementations actually do. If that
shell environment is "the caller" then what does it mean to be "local",
that it isn't allowed to be exported? That it doesn't survive the
termination of that shell environment? If this last one, then why does
it need stating, what variables do survive the termination of the shell
environment? Or was something else fanciful intended there ?


Next, at lines 98862-3

   the value in OPTARG shall be stripped of the option character and the '-'.

So, if we have an optstring of "abc:d" and the invocation of
getopts is

        getopts abc:d var -abcfoo -d

then when 'var' is set to 'c' OPTARG is supposed to be "abfoo" ? (that
is we remove the 'c' and the '-' as instructed).

No, that can't be right, the option-argument is (at least implied by)
XBD 12.1 (which isn't referenced anywhere in XCU 3/getopts - directly
or indirectly, only XBD 12.2) the string which follows the option when
it is included in the same argument as the option, so the 'ab' should not
be included, just "foo" - but the '-' does not follow the option there
either, so why is the standard saying that the '-' must be removed?

Why isn't just saying that OPTARG is the option-argument (properly
defined by an xref) and leaving it at that?

Incidentally, XBD 3.244 is not very helpful here, all it says is an
Option-Argument is:
    A parameter that follows certain options. In some cases an option-argument
    is included within the same argument string as the option--in most cases
    it is the next argument.

The "follows" is suggestive, but "included within the same argument string"
leaves more possibilities open. And why does that say "certain options" ?
If it means options that require one, those aren't "certain". Just
"some options" would be better there.


In the RATIONALE, at lines: 98964-6 :

    Although a leading <plus-sign> in optstring is required to have no
    effect on the behavior of getopt(), this standard intentionally allows
    implementations of the getopts utility to use a leading
    <plus-sign> as an extension that alters behavior.

First, I am not sure just where it intentionally does that, the RATIONALE
isn't a normative part of the standard, so that paragraph can't be it,
did I miss something? But ignoring that...

Implementations are to be allowed to support a leading '+' in optstring.
But how does that affect (at line 98821, and I think other places, like
line 98895, there might be more):

        If the first character of optstring is a <colon> ...

In XSH/getopt it is clear that the optional '+' precedes the optional ':'
in optstring, but if that is followed here, how can that ':' be the
first character of optstring? Must the application use only one or
the other, or is getopts doing the reverse of getopt() and requiring the
order be ":+..." (and if so, where does it say so) or should the wording here
be fixed so it works like the getopt() function ?

And while we're here, the first mention of options (line 98803)
should contain an xref to XBD 3.243, the first mention of option-arguments
(also on line 98803) should have an xref to XBD 3.244 and the first mention of
operand (I think on line 98831) should have an xref to XBD 3.241.
These xrefs then each refer to XBD 12.1 which shows better than the
definitions how those things are formed (particularly in bullet point 1) - but
referencing the definitions is better I think (XBD 12.1 does not refer back
to XBD 3).

Fix it all...

Maybe some wording, for some of it, may follow sometime later, in a note.
Notes
(0006600)
Don Cragun   
2023-12-11 17:50   
(edited on: 2023-12-14 16:51)
On P67, L2057-2058 (XBD 3.244 Option-Argument definition) change:
A parameter that follows certain options. In some cases an option-argument is included within the same argument string as the option—in most cases it is the next argument.
to:
A parameter that follows certain options. In some cases an option-argument immediately follows the option character within the same argument string as the option; otherwise the option-argument is the next argument string.

On page 2995 line 98801 change:
getopts optstring name [arg...]
to:
getopts optstring name [param...]


And globally rename s/arg/param/ elsewhere in the remainder of the getopts page.

On page 2995 lines 98806-98808 Change:
Each time it is invoked, the getopts utility shall place the value of the next option in the shell variable specified by the name operand and the index of the next argument to be processed in the shell variable OPTIND. Whenever the shell is invoked, OPTIND shall be initialized to 1.
to:
When the shell is first invoked, the shell variable OPTIND shall be initialized to 1. Each time getopts is invoked, it shall place the value of the next option found in the parameter list in the shell variable specified by the name operand and the shell variable OPTIND shall be set as follows:
  • When getopts successfully parses an option that takes an option-argument (that is, a character followed by <colon> in optstring, and exit status is 0), the value of OPTIND shall be the integer index of the next element of the parameter list (if any; see OPERANDS below) to be searched for an option character. Index 1 identifies the first element of the parameter list.
  • When getopts reports end of options (that is, when exit status is 1), the value of OPTIND shall be the integer index of the next element of the parameter list (if any).
  • In all other cases, the value of OPTIND is unspecified, but shall encode the information needed for the next invocation of getopts to resume parsing options after the option just parsed.
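
As a rough illustration of the second bullet (not part of the proposed wording), the sketch below only reads OPTIND after getopts has reported end of options, which is one of the points where its value is a well-defined integer index. The function name parse and the variables seen, end_index and rest are hypothetical names chosen for this example.

```sh
# Hypothetical helper: parse -x, -y and -d ARG, then record what is left.
parse() {
    OPTIND=1                      # reset before each fresh parse
    seen=
    while getopts xyd: opt "$@"; do
        case $opt in
            x|y) seen="$seen$opt" ;;
            d)   seen="$seen$opt($OPTARG)" ;;
            *)   return 2 ;;      # unknown option or missing argument
        esac
    done
    # getopts has returned non-zero: OPTIND now indexes the first operand,
    # or is one more than the number of parameters if there are no operands.
    end_index=$OPTIND
    shift $((OPTIND - 1))
    rest="$*"
}

parse -xy -d foo file1 file2
printf '%s\n' "options=$seen end_index=$end_index operands=$rest"
```

The sketch never inspects OPTIND while the loop is still running (for example between the 'x' and 'y' of -xy), since that intermediate value is what the third bullet leaves unspecified.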

Replace Lines 98830-98835 with:
When the end of options is encountered, the getopts utility shall exit with a return value of one; the shell variable OPTIND shall be set to the index of the argument containing the first operand in the parameter list, or the value 1 plus the number of elements in the parameter list if there are no operands in the parameter list; the name variable shall be set to the <question-mark> character. Any of the following shall identify the end of options: the first "--" element of the parameter list that is not an option-argument, finding an element of the parameter list that is not an option-argument and does not begin with a '−', or encountering an error.

Change lines 98836-98837 from:
The shell variables OPTIND and OPTARG shall be local to the caller of getopts and shall not be exported by default.
to:
The shell variables OPTIND and OPTARG shall not be exported by default.

Change lines 98840-98841 from:
The shell variable specified by the name operand, OPTIND, and OPTARG shall affect the current shell execution environment;
to:
The getopts utility can affect OPTIND, OPTARG, and the shell variable specified by the name operand, within the current shell execution environment;

On P2956, L98845-98846 change:
... or with an OPTIND value modified to be a value other than 1, produces unspecified results
to:
... or with an OPTIND value modified by the application to be a value other than 1, produces unspecified results

On P2956, L98861-98863 change:
If the option-argument is not supplied as a separate argument from the option character, the value in OPTARG shall be stripped of the option character and the '−'.
to:
Whether or not the option-argument is supplied as a separate argument from the option character, the value in OPTARG shall only be the characters of the option-argument.
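
A small sketch of what the revised sentence describes, using the optstring "abc:d" from the report. The helper name c_arg and the variables joined and separate are hypothetical; the sketch only shows that OPTARG holds the option-argument alone in both forms.

```sh
# Hypothetical helper: print the option-argument that getopts reports for -c.
c_arg() {
    OPTIND=1
    while getopts abc:d var "$@"; do
        case $var in
            c) printf '%s\n' "$OPTARG" ;;
        esac
    done
}

joined=$(c_arg -abcfoo -d)         # option-argument in the same argument
separate=$(c_arg -ab -c foo -d)    # option-argument as the next argument
printf '%s\n' "joined=$joined separate=$separate"
```

In both calls the reported value is "foo"; neither the preceding option characters nor the '-' are part of it.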

Change P2956, L98868-98869 from:
The getopts utility by default shall parse positional parameters passed to the invoking shell procedure. If args are given, they shall be parsed instead of the positional parameters.
to:
By default, the list of parameters parsed by the getopts utility shall be the positional parameters currently set in the invoking shell environment ("$@"). If param operands are given, they shall be parsed instead of the positional parameters. Note that the next element of the parameter list need not exist; in this case, OPTIND will be set to $#+1 or the number of param operands plus 1.

After P2958, L98964-98966 that currently says:
Although a leading <plus-sign> in optstring is required to have no effect on the behavior of getopt( ), this standard intentionally allows implementations of the getopts utility to use a leading <plus-sign> as an extension that alters behavior.
add a new sentence:
In fact, a <plus-sign> anywhere in the optstring in the getopts utility produces unspecified behavior.






Viewing Issue Advanced Details
1786 [Issue 8 drafts] Shell and Utilities Objection Clarification Requested 2023-11-02 15:13 2023-12-07 15:07
eblake
 
normal  
Applied  
Accepted As Marked  
   
Eric Blake
Red Hat
ebb.ed
XCU ed
2801
92899
Note: 0006577
ed behavior on non-existing filename
This bug was created as fallout from investigating 0000251, investigating whether ed can create pathnames containing a newline.

It turns out that the standard is silent on what happens when the e and f commands are used with a filename (whether explicit or implicit) that does not yet exist. Existing implementations differ on whether e can create a file:

GNU ed:

$ rm -f hi
$ echo q | ed hi
hi: No such file or directory
$ echo $?
0
$ echo f | ed hi
hi: No such file or directory
hi
$ echo $?
0
$ ls hi
ls: cannot access 'hi': No such file or directory
$ echo w | ed hi
hi: No such file or directory
0
$ ls hi
hi


whereas on FreeBSD 13:

# rm -f hi
# echo q | ed hi
hi: No such file or directory
# echo $?
2
# echo f | ed hi
hi: No such file or directory
# echo $?
2
# echo w | ed hi
hi: No such file or directory
# ls hi
ls: hi: No such file or directory


However, all implementations I tested appear to allow f and w to take a filename argument that does not yet exist, at which point w creates that file (which matches the requirement for w at line 93110):



$ rm -f hi
$ printf 'f hi\nw\n' | ed
hi
0
$ ls hi
hi

Back in the context of 0000251, where we stated that the file command line argument to ed is treated the same as an explicit argument to the e command, other than the possibility of including a newline in the name, whether or not you can use ed to create a filename with a newline then depends on the behavior of whether a non-existing pathname is permitted with the e command. However, the proposed wording here is intended to apply as-is regardless of whether we also include the changes in 0000251.
In XCU ed EXTENDED DESCRIPTION, page 2801 line 92901, in the Edit Command text, after the sentence:
If no pathname is given, the currently remembered pathname, if any, shall be used (see the f command).
insert another sentence
If the pathname names a file that does not exist, it is unspecified whether this is treated as an error, or whether a warning is emitted in place of a byte count and the resulting buffer is left empty.


On page 2802 line 92918, in the Filename Command text, change:
If file is given, the f command shall change the currently remembered pathname to file;
to:
If file is given, the f command shall change the currently remembered pathname to file, whether or not file names an existing file;
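
For scripts that need to create a file with ed today, a possible sketch is to avoid the e command and the file operand altogether and to name the file with f, as in the transcript above. The helper name ed_write_new is hypothetical, and the sketch assumes the pathname contains no newline and that the text is a single line other than ".".

```sh
# Hypothetical helper: create (or overwrite) the file $1 with the single
# line $2, starting from an empty buffer and naming the file with f.
# Assumptions: $1 contains no newline; $2 is one line and is not ".".
ed_write_new() {
    # f sets the remembered pathname (and prints it, hence the redirect),
    # a ... . appends the text, w writes the buffer, q quits.
    printf '%s\n' "f $1" a "$2" . w q | ed -s >/dev/null
}
```

A call such as `ed_write_new ./hi hello` then relies only on the f and w behavior that the tested implementations share, not on how e treats a pathname that does not exist.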


Notes
(0006577)
geoffclare   
2023-11-20 17:21   
On page 2797 line 92714 section ed ASYNCHRONOUS EVENTS, change:
If the buffer is not empty and has changed since the last write, the
to:
If the buffer is not empty and the buffer change flag is currently set to either changed or changed-and-warned (see the EXTENDED DESCRIPTION section), the


On page 2797 line 92725 section ed STDERR, change:
The standard error shall be used only for diagnostic messages.
to:
The standard error shall be used for diagnostic messages and may be used for warning messages.


On page 2797 line 92730 section ed EXTENDED DESCRIPTION, after:
The ed utility shall operate on a copy of the file it is editing; changes made to the copy shall have no effect on the file until a w (write) command is given. The copy of the text is called the buffer.
add these sentences:
The ed utility shall keep track of whether the buffer has been modified. This shall be maintained as if via a tri-state internal flag with the state values unchanged, changed, and changed-and-warned, which is:

  • Initially set to unchanged

  • Set to changed by any command that modifies the buffer

  • Set to unchanged by an e or E command that reloads (or empties) the buffer, or a w command that writes the entire buffer

  • Set to either changed-and-warned or unchanged by an e or q command that warns an attempt was made to destroy the editor buffer

A command that makes changes to the buffer in such a way that its contents are the same after the command (for example s/a/a/) shall be considered to have modified the buffer, unless explicitly stated otherwise. In the remainder of the description, this flag is referred to as the buffer change flag.


On page 2800 line 92839 section ed EXTENDED DESCRIPTION, change:
If changes have been made in the buffer since the last w command that wrote the entire buffer
to:
If the buffer change flag is currently set to changed


On page 2800 line 92844, change:
... and shall continue in command mode with the current line number unchanged. If the e or q command is repeated with no intervening command, it shall take effect.
to:
... and shall continue in command mode with the buffer change flag set to either changed-and-warned or unchanged and the current line number unchanged. If another e or q command is then attempted with no intervening command that sets the buffer change flag to changed, it shall take effect.


On page 2801 line 92881, in the Append Command text, add a sentence:
If <text> is empty (that is, the terminating '.' immediately follows the 'a'), the buffer change flag shall not be altered.


On page 2801 line 92901, in the Edit Command text, after the sentence:
If no pathname is given, the currently remembered pathname, if any, shall be used (see the f command).
insert another sentence:
If the pathname names a file that does not exist and the buffer change flag is currently set to unchanged, it is unspecified whether this is treated as an error, or whether the resulting buffer is emptied and a warning is written to standard error instead of writing the byte count to standard out.


On page 2801 line 92909, in the Edit Command text, change:
If the buffer has changed since the last time the entire buffer was written
to:
If the buffer change flag is currently set to changed


On page 2802 line 92914, in the Edit Without Checking Command text, change:
shall not check to see whether any changes have been made to the buffer since the last w command.
to:
shall not check the current state of the buffer change flag.


On page 2802 line 92918, in the Filename Command text, change:
If file is given, the f command shall change the currently remembered pathname to file;
to:
If file is given, the f command shall change the currently remembered pathname to file, whether or not file names an existing file;


On page 2803 line 92977, in the Insert Command text, add a sentence:
If <text> is empty (that is, the terminating '.' immediately follows the 'i'), the buffer change flag shall not be altered.


On page 2805 Read Command, add a new paragraph at the end (after line 93044):
If the number of bytes read is 0 it is unspecified whether the buffer change flag is set to changed or left unaltered.


On page 2805 line 93027, in the Quit Command text, change:
If the buffer has changed since the last time the entire buffer was written
to:
If the buffer change flag is currently set to changed


On page 2805 line 93031, in the Quit Without Checking Command text, change:
without checking whether changes have been made in the buffer since the last w command
to:
without checking the current state of the buffer change flag


On page 2807 line 93120, in the Write Command text, change:
This usage of the write command with '!' shall not be considered as a ``last w command that wrote the entire buffer'', as described previously; thus, this alone ...
to:
This usage of the w command with '!' shall not alter the buffer change flag; thus, this alone ...


After page 2811 line 93292 section ed RATIONALE, add a paragraph:
Implementations are encouraged to set the buffer change flag to changed-and-warned when an e or q command warns that an attempt was made to destroy the editor buffer. Some existing implementations set it to unchanged, but this has the undesirable side-effect that a SIGHUP received after the warning is given does not write the buffer to ed.hup.


On page 2811 line 93294, change FUTURE DIRECTIONS from "None" to:
A future version of this standard may require that the buffer change flag is set to changed-and-warned when an e or q command warns that an attempt was made to destroy the editor buffer.
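
The flag transitions described in this note can be sketched as a toy state machine. This is only a model of the wording and does not run ed; the names flag, warn_state, modify, write_all, try_quit and hup_saves are hypothetical, and warn_state is set to the encouraged value although the text also permits "unchanged".

```sh
# Toy model of the buffer change flag; it does not run ed.
# Assumption: after a warning the flag becomes changed-and-warned
# (the encouraged choice); "unchanged" is also permitted by the text.
warn_state=changed-and-warned
flag=unchanged                     # initial state

modify()    { flag=changed; }      # any command that modifies the buffer
write_all() { flag=unchanged; }    # w that writes the entire buffer

# e or q: return 0 if the command takes effect, 1 if it only warns.
try_quit() {
    if [ "$flag" = changed ]; then
        flag=$warn_state
        return 1
    fi
    return 0
}

# SIGHUP: return 0 if the buffer would be saved to ed.hup
# (the buffer is assumed to be non-empty).
hup_saves() {
    case $flag in
        changed|changed-and-warned) return 0 ;;
        *) return 1 ;;
    esac
}

modify
try_quit || echo "first q: warned, flag=$flag"
hup_saves && echo "SIGHUP now would still save the buffer"
try_quit && echo "second q: takes effect"
```

Setting warn_state=unchanged instead models the existing implementations mentioned in the RATIONALE text: the second q still takes effect, but hup_saves fails after the warning, which is the side-effect the note calls undesirable.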




Viewing Issue Advanced Details
1787 [1003.1(2016/18)/Issue7+TC2] System Interfaces Editorial Error 2023-11-03 12:13 2023-12-07 14:14
wakely
 
normal  
Applied  
Accepted As Marked  
   
Jonathan Wakely
Red Hat
strcpy
2029
65025
---
See Note: 0006566.
strcpy summary says it returns a pointer to the end of the result
The NAME for strcpy says:

"stpcpy, strcpy - copy a string and return a pointer to the end of the result"

This is correct for stpcpy, but strcpy returns a pointer to the start of the result.

Fix the description to be valid for both stpcpy and strcpy.
Notes
(0006566)
geoffclare   
2023-11-06 10:11   
In Issue 6 and earlier the NAME section was simply:
strcpy - copy a string
It was "broken" when stpcpy() was added in Issue 7. I suggest reverting to the original text, i.e. change the NAME section to:
stpcpy, strcpy - copy a string