
Re: Here-documents in $(...)

To: austin-group-l@xxxxxxxxxxxxx
Subject: Re: Here-documents in $(...)
From: Geoff Clare <gwc@xxxxxxxxxxxxx>
Date: Thu, 28 May 2009 10:21:26 +0100
References: <4A15E71C.3070202@sonic.net> <200905261440.n4QEep5H018582@penguin.research.att.com> <20090527095232.GA18259@squonk.masqnet> <4A1D25C9.5000907@byu.net> <20090527143710.GB28819@squonk.masqnet> <4A1D9050.8060008@byu.net>
Eric Blake <ebb9@byu.net> wrote, on 27 May 2009:
>
> According to Geoff Clare on 5/27/2009 8:37 AM:
> > Eric Blake <ebb9@byu.net> wrote, on 27 May 2009:
> >> It would be worth having the standard specify whether both of
> >> these examples are well-behaved (and print hi) or syntax errors:
> >>
> >> echo $(
> >> cat <<\)
> >> hi
> >> )
> >> )
> >>
> >> echo $(
> >> cat <<EOF
> >> hi
> >> EOF)
> > 
> > The standard is already clear that both of these must print hi.
> 
> I disagree - the standard as it currently stands is only clear that the
> first must print hi:
> 
> cat <<\)
> hi
> )
> 
> is a well-formed here-doc.
> 
> But the second is questionable: the current documentation for here-docs
> states "The here-document shall be treated as a single word that begins
> after the next <newline> and continues until there is a line containing
> only the delimiter and a <newline>".  Since there is no newline after the
> EOF token, one could argue that the here-doc is unterminated, and that the
> enclosing $( has no matching ), and that this is a syntax error rather
> than printing hi.

Ouch.  There is a bigger problem here.  The standard requires shell
scripts to be text files (but with no line length limit), so the
requirement that the text inside $(...) must be "any valid shell
script" means that it must end with a newline (unless it is empty),
and the behaviour of simple things like $(echo foo) is unspecified.

We could fix that by changing "Any valid shell script" to "Any valid
shell script (with or without a terminating <newline>)".
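
As a minimal sketch of the point above (not taken from the standard; the variable names are illustrative only), the two substitutions below differ only in whether the text inside $(...) ends with a <newline>. Existing shells are expected to treat them alike, which is the behaviour the proposed wording would make explicit:

```shell
# Text inside $(...) with no terminating <newline>.
without_nl=$(echo foo)

# The same command, but the enclosed text ends with a <newline>.
with_nl=$(echo foo
)

if [ "$without_nl" = "$with_nl" ]; then
    echo "same result: $without_nl"
else
    echo "results differ"
fi
```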

> My concern is that if you change here-docs to allow end-of-input to be a
> valid terminator, then the first becomes ambiguous - is it a command
> substitution of:
> 
> cat <<\)
> hi
> END_OF_INPUT
> 
> followed by an unmatched ')', or does the command substitution not end
> until the second ')'?

END_OF_INPUT would only be considered a valid terminator for
complete_command if it follows a syntactically correct "list" (as
defined in the grammar).  Thus the parser could not finish parsing
the above as a complete_command after the "hi" but would continue
to include the delimiter.
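
A rough way to see how a given shell parses the first example is to hand it to that shell as a separate script. This is only a sketch: it assumes the shell under test is installed as sh, and the names script, out and status are illustrative. A shell that follows the reasoning above should take the first ')' line as the here-document delimiter and the second ')' as the end of the command substitution:

```shell
# The first example from the thread, held in a variable so that a shell
# which rejects it does not abort this probe itself.
script='echo $(
cat <<\)
hi
)
)'

# Assumption: "sh" is the shell being examined; substitute another name as needed.
if out=$(sh -c "$script" 2>/dev/null); then
    status=0
else
    status=$?
fi

if [ "$status" -eq 0 ]; then
    printf 'accepted, output: %s\n' "$out"
else
    printf 'rejected, exit status: %s\n' "$status"
fi
```

The probe only reports what the shell did; it makes no claim about which shells accept the input.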

> On the other hand, by allowing end-of-file to
> terminate a here-doc, then the second example is no longer a syntax error
> - the line with "EOF)" is treated as the here-doc terminator, end of
> input, then the closing ')' for the command substitution.
> 
> Meanwhile, most shells that I tested output 'hi' for this case (showing
> that end-of-input is already special-cased for here-docs in many shells),
> although posh complained:
> 
> $ bash -c 'cat <<EOF
> hi'
> hi
> $ posh -c 'cat <<EOF
> hi'
> posh: here document `EOF' unclosed

Since the input is syntactically incorrect according to the standard's
grammar, the shell is free to either report an error or execute
something of its choosing (as an extension).
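
Since either outcome is permitted, a portable script cannot rely on one of them. The sketch below merely records which choice a shell makes for a here-document cut off by end of input. It assumes the shell is invoked as sh, and the names script, out and verdict are illustrative:

```shell
# A here-document whose delimiter line never appears before end of input.
script='cat <<EOF
hi'

# Assumption: "sh" is the shell being examined.
if out=$(sh -c "$script" 2>/dev/null); then
    verdict="extension: ran and printed '$out'"
else
    verdict="error reported"
fi

echo "$verdict"
```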

-- 
Geoff Clare <g.clare@opengroup.org>
The Open Group, Thames Tower, Station Road, Reading, RG1 1LX, England
