> Date: Tue, 9 Apr 2002 14:57:22 -0400 (EDT)
> From: David Korn <yyy@xxxxxxxxxxxxxxxx>
>
> /(week|wee)(night|knights)(s*)/
> Once the longest leftmost match of the complete string is found,
> "weeknights", which could be matched by or
> wee knights
> week night s
> The longest leftmost match of the leftmost parenthesised group is
> matched. This, the result is
> week night s
This example is consistent with the following interpretations:
* RE-CONCAT interpretation (1) plus RE-ASSOC interpretation (1b).
* RE-CONCAT interpretation (2) plus RE-ASSOC interpretation (2).
Since RE-CONCAT(2) is rejected (as described in my previous message),
I guess you must be intending RE-ASSOC(1b); or perhaps you intend some
other interpretation for RE-ASSOC that is consistent with the examples
in the June 1995 minutes. But, like Tom Lord, I don't see anything in
the June 1995 minutes that addresses this point.
If we interpret our example ERE /(week|wee)(night|knights)(s*)/ using
the terminology of the June 1995 minutes, RE-ASSOC(1b) requires that
/(night|knights)(s*)/ must be deemed to be "to the right of"
/(week|wee)/. But this is inconsistent with the POSIX grammar for
REs. The grammar says that the ERE's parse tree looks like this:
1
/ \
/ \
/ \
2 3
/ \ (s*)
/ \
/ \
/ \
/ \
/ \
/ \
4 5
(week|wee) (night|knights)
and, given this parse tree, Interp #43 Part 14 says that the
left-to-right order of these subpatterns is:
1, 2, 4, 5, 3
and this is equivalent to RE-ASSOC(1a), not (1b).
Here's another way of putting it. The POSIX grammar for EREs says
that /(night|knights)(s*)/ is not a subpattern of
/(week|wee)(night|knights)(s*)/, so it doesn't contribute to the ERE's
semantics.
However, I see that Andrew Hume writes that RE-ASSOC(1b) was the
intended meaning. If so, the simplest fix is to modify the POSIX
grammar so that RE concatenation is right-associative, as suggested at
the end of RE-ASSOC.
|