
Re: Some RFEs that (may) deserve discussion

To: austin-group-futures-l@xxxxxxxxxxxxx
Subject: Re: Some RFEs that (may) deserve discussion
From: Geoff Clare <gwc@xxxxxxxxxxxxx>
Date: Wed, 26 Mar 2008 11:25:01 +0000
References: <47E17314.1080403@hccfl.edu> <200803192009.m2JK9qE04066@mailman.opengroup.org> <47E1740E.9010305@acm.org>

Wayne Pollock <pollock@acm.org> wrote, on 19 Mar 2008:
>
> Normally I would post each item in a separate
> email, to maintain proper threads.  But I'm not sure
> if any of this is right to post in this list, so
> I didn't want to clutter up the list.  If this is
> the right content for this list, please post any
> follow-ups in their own threads?  (Or I can
> repost as many separate posts.)

Most of these look like they are worth considering for the next revision.
I've commented on a few of them.  I didn't think it was worth putting
comments in separate threads (for now) as there is so little traffic on
this list anyway.

> ----------
> Allow sleep(1) sub-second times
> ----------
> Define awk's rand() function more completely; it is
> currently the only user-level command that can generate
> random numbers.

Obviously this one will depend on the detail of what you want to
specify.  It should be okay as long as it doesn't break any
existing awk implementations.

> ----------
> Add/define shell RANDOM as an LCG (linear congruential generator) type
> Add/define /dev/random as blocking and suitable for cryptographic
> uses, and /dev/urandom as non-blocking and suitable for other uses.
> ----------
> Define locale names as:
>     a pathname containing slashes
>     locale.<string>  # for non-standard or locally defined names,
>                        with some appropriate limits on <string>
>     "POSIX"
>     "C"
> or a name in the format:
>     language[_territory][.codeset][@modifier]
> where "language" is an ISO 639 language code,
>       "territory" is an ISO 3166 country code,
>       "codeset" is a character set identifier like
>       "ISO-8859-1" or "UTF-8", and
>       "modifier" is any string up to X characters from
>       the POSIX portable character set.
> 
> One problem is there is no standard list of codesets.
> The IANA maintains a list at
>      www.iana.org/assignments/character-sets
> but it is not definitive nor does it include many
> commonly used charset names.  For example "utf8" is
> not listed as an alias of "UTF-8", but Fedora Linux
> systems (and presumably other GNU-based systems) don't include
> "*.UTF-8" in the list of supported locales!  "*.utf8" locale
> names are used (according to the output of "locale -a").
> 
> Sun defines the IANA list as the official JRE list for Java.
> 
> Maybe it's time for a living POSIX standard for this?
> ------------
> tar options for all extended attributes plus ACLs.  (Currently
> only "star" supports this reliably on all systems, using a
> non-standard archive type and other options.  pax was supposed
> to but it doesn't.)

Since ACLs aren't in the standard, it can't specify whether/how pax
handles them.

I don't know enough about POSIX.1e and the reasons it was abandoned
to comment on whether it might be worth proposing to adopt just the
ACLs from it into POSIX.1.

> ------------
> Move the documentation stating that ">" redirection is an atomic
> "test and create if missing" operation when noclobber mode is set
> from its obscure notes location (I forget now where I found this)
> to the I/O redirection page.
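
For anyone unfamiliar with the behaviour being described, here is a
minimal sketch of how it is typically relied upon.  The function name
try_create is made up for illustration; it assumes only that the shell
honours "set -C" (noclobber) as the item above describes.

```shell
# try_create is a hypothetical helper name, not a standard utility.
# It succeeds only if it was the one to create the file named by $1.
try_create() {
    # Use a subshell so that noclobber mode does not leak to the caller.
    # With noclobber set, ">" fails if the file already exists.
    ( set -C; : > "$1" ) 2>/dev/null
}
```

A caller would use it along the lines of
"if try_create /some/lockfile; then ...; fi", removing the file when
finished.
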
> ------------
> Add a mktemp(1) utility
> ------------
> read(1) clarification of backslash behavior:
> In read(1) a backslash <newline> is removed
> from the input regardless of the IFS setting.
> Otherwise a backslash escapes the word
> separators specified by the IFS setting.
> A backslash followed by any other character
> results in implementation defined [or unspecified ?]
> behavior.

The handling of backslashes by the read utility has been clarified
in draft 4:

    By default, unless the -r option is specified, <backslash> ('\')
    shall act as an escape character.  An unescaped <backslash> shall
    preserve the literal value of the following character, with the
    exception of a <newline>. If a <newline> follows the <backslash>,
    the read utility shall interpret this as line continuation. The
    <backslash> and <newline> shall be removed before splitting the
    input into fields. All other unescaped <backslash> characters
    shall be removed after splitting the input into fields.

I think the only difference from your proposal is that backslash
escapes all non-newline characters, not just IFS characters.
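
A small sketch of the behaviour described by the draft 4 text, for
illustration only.  The helper names split_default and split_raw are
made up here; each feeds its arguments as input lines to read and shows
the first two fields.

```shell
# Hypothetical helpers: print the two fields produced by read.
split_default() {
    printf '%s\n' "$@" | {
        read first second
        printf '<%s> <%s>\n' "$first" "$second"
    }
}
split_raw() {
    printf '%s\n' "$@" | {
        read -r first second
        printf '<%s> <%s>\n' "$first" "$second"
    }
}

split_default 'a\ b c'      # escaped space kept in field 1: <a b> <c>
split_raw     'a\ b c'      # -r: backslash is literal:      <a\> <b c>
split_default 'x\yz'        # escapes a non-IFS character:   <xyz> <>
split_default 'one\' 'two'  # line continuation:             <onetwo> <>
```
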

> -----------
> pathname expansion
> There is a readlink() system call defined, but no
> user-level command for this.  I suggest adopting
> GNU readlink(1) as a starting point.

Adding a readlink utility is worth considering, but it is already
possible (if awkward) to use ls -l to obtain the contents of a
symlink.  The following function works as long as the pathname
doesn't contain the string " -> ".  It could be extended to handle
such pathnames by removing a " -> " string from the ls -l output
for each " -> " in the pathname.

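readlink() {
    # Refuse pathnames containing " -> ", since that string is used
    # below to locate the start of the symlink contents.
    case $1 in
    *" -> "*)
        printf >&2 \
            'readlink: %s: cannot handle pathnames containing " -> "\n' "$1"
        return 1
        ;;
    esac
    # The trailing "." keeps command substitution from stripping
    # newlines at the end of the ls output.
    readlink_tmp=$(ls -l -- "$1" && echo .) || return $?
    # The first character of ls -l output is the file type;
    # "l" means symbolic link.
    case $readlink_tmp in
    l*) ;;
    *)
        printf >&2 'readlink: %s is not a symbolic link\n' "$1"
        unset readlink_tmp
        return 1
        ;;
    esac
    # Strip everything up to and including the first " -> ", then
    # the "." added above.
    readlink_tmp=${readlink_tmp#*" -> "}
    printf '%s' "${readlink_tmp%.}" || return $?
    unset readlink_tmp
    return 0
}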
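
The extension mentioned above might look something like the following
rough sketch.  The name readlink_arrows and its variable names are made
up for illustration.  It assumes that ls writes the pathname operand
unmodified, that nothing before the pathname in the ls -l output
contains " -> ", and that the pathname neither begins with "-> " nor
ends with " ->" (either would merge with the separator and confuse the
counting).

```shell
# Hypothetical variant: handles " -> " inside the pathname.
readlink_arrows() {
    readlink_out=$(ls -l -- "$1" && echo .) || return $?
    case $readlink_out in
    l*) ;;
    *)
        printf >&2 'readlink_arrows: %s is not a symbolic link\n' "$1"
        unset readlink_out
        return 1
        ;;
    esac
    # Remove one " -> " from the ls output for each " -> " found
    # in the pathname.
    readlink_rest=$1
    while :; do
        case $readlink_rest in
        *" -> "*)
            readlink_rest=${readlink_rest#*" -> "}
            readlink_out=${readlink_out#*" -> "}
            ;;
        *)  break ;;
        esac
    done
    # Remove the separator between the pathname and the contents,
    # then the "." added above.
    readlink_out=${readlink_out#*" -> "}
    printf '%s' "${readlink_out%.}" || return $?
    unset readlink_out readlink_rest
    return 0
}
```
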

> -----------
> date(1) command extension to format a given date ("-d date").
> Also format conversion specifications for the RFC 2822 and
> RFC 3339 formats.
> -----------
> Additional find(1) search criteria, for tasks that are difficult
> or impossible with the current limited set of criteria.

Several of the GNU find extensions are probably worth considering,
although I would object to adding -maxdepth and -mindepth since
they don't follow the standard convention for numeric primaries.
(Adding an equivalent -walkdepth primary would be okay: it would be
used as "-walkdepth 1", "-walkdepth -3", "-walkdepth +1", etc.)
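
To illustrate the convention I mean: for a numeric primary, n matches
exactly n, +n matches more than n, and -n matches less than n.  The
sketch below uses the existing -size primary (which counts 512-byte
units, rounded up), since -walkdepth is only a suggestion.  The function
name size_demo is made up, and it assumes $1 names an empty scratch
directory.

```shell
# Hypothetical demo of the n / +n / -n convention using -size.
size_demo() {
    : > "$1/empty"              # 0 bytes  -> 0 units
    printf 'x' > "$1/small"     # 1 byte   -> rounds up to 1 unit
    printf 'exactly 0: ';    find "$1" -type f -size 0
    printf 'more than 0: ';  find "$1" -type f -size +0
    printf 'fewer than 1: '; find "$1" -type f -size -1
}
```

Run on an empty directory, the first and third lines should name the
empty file and the second the one-byte file.
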

-- 
Geoff Clare <g.clare@opengroup.org>
The Open Group, Thames Tower, Station Road, Reading, RG1 1LX, England
