
Re: Operators in 1003.1

To: "Schwarz, Konrad (CT)" <konrad.schwarz@xxxxxxxxxxx>
Subject: Re: Operators in 1003.1
From: "Rocky Bernstein" <rocky.bernstein@xxxxxxxxx>
Date: Mon, 30 Jun 2008 08:34:02 -0400
Cc: austin-group-l@xxxxxxxxxxxxx
References: <6cd6de210806272128t65dffff1t9d065ece5a1c9608@mail.gmail.com> <2A0150C01096FE48B4DEFC16A39E9C18026978B1@MCHP7RCA.ww002.siemens.net> <6cd6de210806300339y139aa4cbq670c607c8e69e722@mail.gmail.com> <2A0150C01096FE48B4DEFC16A39E9C18026979C2@MCHP7RCA.ww002.siemens.net>
On Mon, Jun 30, 2008 at 7:21 AM, Schwarz, Konrad (CT)
<konrad.schwarz@siemens.com> wrote:
>> From: Rocky Bernstein [mailto:rocky.bernstein@gmail.com]
>> Sent: Monday, June 30, 2008 12:39 PM
>> To: Schwarz, Konrad (CT)
>> Cc: austin-group-l@opengroup.org
>> Subject: Re: Operators in 1003.1
>>
>> On Mon, Jun 30, 2008 at 4:11 AM, Schwarz, Konrad (CT)
>
>> > Section 2.2, Quoting, lists the characters with special meaning.
>>
>> Yes, this is true. However it doesn't call those characters
>> "operators".  Are you implying that all of the characters in section
>> 2.2 with special meaning are operators?
>
> No.  However, in the part of your earlier mail that I deleted, you
> write:
>
>> Given the implication that in parsing these tokens need not have white
>> space around them, wouldn't it be helpful to have a list somewhere and
>> to more explicit what all of the token operators are?
>
> This would seem to be more or less the characters with special
> meaning---
> irrespective of whether they are used as operators or not.

When someone uses a phrase like "This would seem to be more or less"
regarding something that is a specification, it is a little
unsettling. Do you mean they are the ones that do not need white space
around them, or do you mean you don't know?


>
> You further write:
>
>> Also, I note
>> that in some parts of the document these operators are further split
>> up into other categories like "arithmetic operator" (which is relevant
>> only in a particular arithmetic context?) "control operator" or
>> "redirection operator" perhaps "list operator".
>>
>> Again wouldn't be helpful to have these listed all in one place
> somewhere?
>
> I don't see how such a list would be of much benefit.  Either one is
> interested
> in the semantics of an operator; then a simple list does not really
> help.  Or one is
> interested in knowing which characters require quoting; this is defined
> in the
> section I mentioned above.

I'm interested in understanding when I write a program which
characters used in common constructs require white space around them
and which don't.

Generally, the reserved words '!' and '{' do, and '(' and ';' do not.
All of them may need to be quoted or escaped in some contexts. I suppose
one might infer whether white space is needed from what gets quoted
and/or from which things are reserved words, but I find all of this a
little contorted. In my opinion the specification isn't making it easy
to answer this basic question, which a programmer needs answered in
order to write a program.
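
To make the question concrete, here is a minimal sketch of the
behavior I have in mind, assuming a POSIX-style sh. The variable names
are only for illustration, and the failure case assumes no command
literally named "!false" exists on the system.

```shell
# '{' is a reserved word: it is recognized only as a separate word,
# so white space must follow it (and ';' or a newline must precede '}').
braces=$({ echo grouped; })

# '(' and ')' delimit tokens by themselves; no white space is needed.
parens=$( (echo subshell) )

# ';' also delimits tokens by itself.
semis=$(printf one;printf two)

# '!' is a reserved word: with white space it negates the exit status.
if sh -c '! false'; then bang_spaced=ok; else bang_spaced=failed; fi

# Without white space, '!false' is one ordinary word, which a
# non-interactive sh tries to run as a command name.
if sh -c '!false' 2>/dev/null; then bang_joined=ok; else bang_joined=failed; fi

echo "$braces $parens $semis $bang_spaced $bang_joined"
```

Nothing in a single list in the document tells me to expect the
difference between the first two cases.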

>
> On the other hand, I feel that the Shell document is large already;
> adding bulk
> does not necessarily add to its usefulness.

Well, another way to address this is to change the sentence "the shell
breaks tokens into words and operators". This sentence sets up some
kind of expectation that a programmer is supposed to keep around the
concepts "word" and "operator" in addition to "token" because they have
some special implications. "word" does because words need to be
delimited some way, but what value does "operator" add here? Is this
just because some operators are more than one symbol in length? So
perhaps it's really the fact that some non-word tokens are more than
one character in length.

So how about "Some of the tokens are words and some of the non-word
tokens are more than one character long"?  And then a corresponding
change in section 2.3 to remove the reference to an operator.

>
> Additionally, I imagine that the various outgrowths of the Shell do not
> want to standardize
> on a fixed set of operators as this would limit their ability to evolve
> syntactically.

I think this shows why a statement like "tokens are split into words
and operators" may be a bit misleading, even though it may be
perfectly accurate (for now). But even here, I take that statement to
not mean they are exclusively split into words and operators,  because
there's this other class of special symbols like semicolon which I
guess from an interpreter's standpoint are tokens too.

Is + an operator? Well, inside an arithmetic expression it is, and it
doesn't need any white space around it. But outside of that context,
no, it's not an operator. So in expr 5 + 6 you need to have the spaces,
and although I understand + is an argument passed to expr, the casual
shell programmer is probably going to find this confusing.
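
A small sketch of the two contexts, assuming a POSIX-style sh and the
standard expr utility (the variable names are only for illustration):

```shell
# Inside arithmetic expansion '+' is an operator; no white space needed.
sum_arith=$((5+6))

# Outside it, '+' is an ordinary character. expr sees three arguments
# here only because white space separates them.
sum_expr=$(expr 5 + 6)

# Without white space the shell passes the single word "5+6", and expr
# prints that string back unchanged instead of adding.
no_space=$(expr 5+6)

echo "$sum_arith $sum_expr $no_space"
```

The third case gives no error at all, which is part of what makes it
confusing.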

Finally, this is how tokenization in shell languages is vastly
different from just about any other programming language. In other
programming languages, + is usually an operator *except* when it's
in a string, part of a pattern, or in a comment. In shell + is *not*
an operator except when it is in an arithmetic expression.

>
> Regards,
> Konrad Schwarz
>
