| To: | yyyyyyyyyyyyyy@xxxxxxxxxxxxx |
|---|---|
| Subject: | Re: Re: RE_CONCAT: question about RE concatenation and subpattern matching |
| From: | David Korn <yyy@xxxxxxxxxxxxxxxx> |
| Date: | Tue, 9 Apr 2002 18:02:03 -0400 (EDT) |
| Cc: | yyy@xxxxxxxxxxxxxxxx |
I think that the main problem stems from the definition of subexpression.
Section 9.3.6, item 2, on page 172 of the new standard
defines a subexpression as the characters between \( and \) for a BRE
(for an ERE it would be (...) ).
For ((week|wee)(night|knights))(s*)
The subexpressions are
1. ((week|wee)(night|knights))
2. (week|wee)
3. (night|knights)
4. (s*)
in that order. What the RE group decided was that the leftmost
longest rule would be applied first to the outer level
subexpressions, 1 and 4, so that for this case
\1 would be weeknights
\4 empty
Then, \2 and \3 would be computed by looking at all the ways
that
((week|wee)(night|knights))
could match weeknights and choosing the leftmost longest of (week|wee).
However, the only match is
wee knights
so that \2 is wee and \3 is knights.
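To make the order of decisions concrete, here is a rough sketch in Python. It is not how any regex library works: the pattern is hard-coded, the match is assumed to start at the beginning of the string, and the function name outer_first_submatches is made up for this illustration. It lists every way the pattern can match and then prefers the longest overall match, then the longest \1 and \4, and only then the longest \2 and \3.

```python
from itertools import product


def outer_first_submatches(text):
    """Return (g1, g2, g3, g4) for ((week|wee)(night|knights))(s*), or None.

    Illustrative only: the pattern is hard-coded and the match is assumed
    to start at position 0 of text.
    """
    candidates = []
    for g2, g3 in product(("week", "wee"), ("night", "knights")):
        g1 = g2 + g3
        if not text.startswith(g1):
            continue
        rest = text[len(g1):]
        max_s = len(rest) - len(rest.lstrip("s"))  # run of 's' after group 1
        for k in range(max_s + 1):
            candidates.append((g1, g2, g3, "s" * k))
    if not candidates:
        return None

    def rank(c):
        g1, g2, g3, g4 = c
        # Longest overall match first, then outer groups 1 and 4,
        # then the groups nested inside group 1.
        return (len(g1) + len(g4), len(g1), len(g4), len(g2), len(g3))

    return max(candidates, key=rank)


print(outer_first_submatches("weeknights"))
# ('weeknights', 'wee', 'knights', '')
```

For comparison, Python's own re module takes the first alternative that leads to a match, so for the same input it reports week, night and s rather than the values above.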
I hope that this will clarify the intent at least.
David Korn
research!dgk
yyy@xxxxxxxxxxxxxxxx