Email List: Xaustin-group-lX
[All Lists]

Re: RE-ASSOC: a question about the associativity of RE concatenation

To: yyyyyyyyyyyyyy@xxxxxxxxxxxxx
Subject: Re: RE-ASSOC: a question about the associativity of RE concatenation
From: "Daniel F. Savarese" <yyy@xxxxxxxxxxxx>
Date: Fri, 12 Apr 2002 15:18:57 -0400
In message <yyyyyyyyyyyyyyyyyyyyy@xxxxxxxxxxxxxxxx>, Tom Lord writes:
>Left-to-right maximization of subexpressions practically demands left
>associativity.

At the risk of putting my foot in my mouth again, this is exactly right
when you use a leftmost derivation.  You've all agreed on leftmost longest
matching and seem to be looking to so-called classical regular expression
behavior for guidance.  Classically, leftmost means a leftmost derivation
(see Aho, Sethi, and Ullman's "Compilers: Principles, Techniques, and Tools").
When I look at the grammar in the standard, a leftmost derivation yields left
associative concatenation and the parse tree for 1a.

Using Paul's ABC example (paretheses omitted) and the following grammar
rule
  RE_expression : simple_RE | RE_expression simple_RE
you wind up with the following leftmost derivation (i.e., where the leftmost
nonterminal is always expanded first):
  RE_expression => RE_expression simple_RE =>
  RE_expression smiple_RE simple_RE => simple_RE smiple_RE simple_RE =>
  A simple_RE simple_RE => A B simple_RE => A B C
If I'm not mistaken this corresponds to 1a.

A rightmost derivation would yield:
  RE_expression => RE_expression simple_RE =>
  RE_expression C => RE_expression simple_RE C => RE_expression B C
  simple_RE B C => A B C
This would correspond to 1b.

I know you've already discussed what the grammar says versus additional
semantic information in accompanying words, so the above is not news
to anyone.  However, it is important that the accompanying words are clear.
If you go a different way than 1a, please strike the leftmost longest language
from the standard because it will be misusing leftmost the same way I was
misusing it (having been too contaminated with Perlisms and using leftmost
to refer to the mental NFA matching process rather than the derivation).
My principal concern in all of this is not what the behavior should be, but
that once you've decided what it should be, that the resulting language is
clear and we don't wind up back where we started.

daniel


<Prev in Thread] Current Thread [Next in Thread>