Email List: Xaustin-group-lX
[All Lists]

Re: RE-ASSOC: a question about the associativity of RE concatenation

To: yyyyyyyyyyyyyy@xxxxxxxxxxxxx
Subject: Re: RE-ASSOC: a question about the associativity of RE concatenation
From: Paul Eggert <yyyyyy@xxxxxxxxxxx>
Date: Tue, 9 Apr 2002 16:40:56 -0700 (PDT)
Cc: yyyy@xxxxxxxxxxx, yyyyy@xxxxxxxxxxxxxx, yyyyy@xxxxxxxxxx
References: <200204091857.OAA35259@raptor.research.att.com> <200204092229.g39MTeX00802@shade.twinsun.com> <200204092254.PAA02436@morrowfield.home>
> Date: Tue, 9 Apr 2002 15:54:24 -0700 (PDT)
> From: Tom Lord <yyyy@xxxxxxxxxxx>

> I favor the left-associative interpretation because:
> 
>       1. It's what the grammar says.

Hard to argue with that.


>       2. ... I find left-associativity natural.

I briefly surveyed the net for whether left-associativity is more
common than right-associativity in RE concatenation.  I found many
examples of left-associativity.

  Scalable Simulation Framework (SSF) Regular Expressions
  http://www.ssfnet.org/SSFdocs/regexpPage.html

  A Java Package for Regular Expressions 
  http://www.sunsite.ubc.ca/LivingMathematics/V001N01/RegExp/regexp.html

  Pattern Matching with Regular Expressions in C++
  http://www.linuxgazette.com/issue27/mueller.html

  Lexical Analysis
  http://ironbark.bendigo.latrobe.edu.au/subjects/bitsys/clect/Lect02.html

In contrast I didn't find any example using right-associativity.
Admittedly I didn't look that hard; Google found about 1000 matches
and I looked only at a few dozen.

I did find one web page:

  Assignment #1 Solution: Regular Expression Grammar
  http://www.cs.sfu.ca/~cameron/Teaching/383/972/972s1.ps

that explicitly says that RE concatenation is left-associative, but it
could just as easily be right-associative, as that wouldn't affect
semantics.  (As we've seen, this equivalence does not hold for POSIX
RE semantics.)

So if common theoretical computer science tradition holds weight, we
should leave the POSIX RE grammar alone, and keep it left-associative.


>       3. It has the property that additional subexpressions
>          to the right, which don't extend the match, have
>          no effect on what is matched by the subexpressions
>          to the left.

Can you please elaborate on this a bit?  It seems to me RE-ASSOC(1a)
and RE-ASSOC(1b) both have this nice property, so it doesn't argue for
one interpretation over the other.

<Prev in Thread] Current Thread [Next in Thread>