On Fri, Apr 05, 2002 at 03:37:31PM -0600, Mark Brown wrote:
> RE-ASSOC: a question about the associativity of RE concatenation
>
> For example, suppose we match the ERE /(week|wee)(night|knights)(s*)/
> to the string "weeknights", and want to know which subpatterns match
> which substrings. Here are some possible interpretations:
>
> (1a) The "subpatterns" are the two immediate subexpressions,
> interpreted according to the grammar, namely
> /(week|wee)(night|knights)/ and /(s*)/. The longest consistent
> match for /(week|wee)(night|knights)/ is "weeknights".
> Therefore, /(s*)/ matches the empty string.

As far as I can tell, and this is also what the grammar
indicates, the regexp is matched left to right, with each
component taking the longest match it can find.

For the three components in your example, the first (leftmost)
component matches "week", and the second component then has to
work on the remainder of the string, "nights". There, only
"night" matches, which leaves "s" for /(s*)/.

I do not believe that the regexp tries to find the combined
longest match of the first two components, as you seem to indicate.
So this should work in the way you argue is most intuitive for
users, giving the results of your example 1b, which
I agree is the most desirable outcome.
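
To make the left-to-right reading concrete, here is a small Python
sketch of it. This is only an illustration of the interpretation
described above, not the algorithm of any particular regexp
implementation. The helper names (alt, star, match_left_to_right)
are made up for this example, and the sketch assumes the pattern
has to cover the whole string.

```python
def alt(*words):
    """Component for an alternation of literals, e.g. (week|wee).
    Returns every alternative that is a prefix of the text."""
    return lambda text: [w for w in words if text.startswith(w)]


def star(ch):
    """Component for a starred single character, e.g. (s*).
    Returns every run of `ch` that is a prefix of the text,
    including the empty string."""
    def prefixes(text):
        out = [""]
        n = 0
        while n < len(text) and text[n] == ch:
            n += 1
            out.append(ch * n)
        return out
    return prefixes


def match_left_to_right(components, text):
    """Match components left to right against the whole text.
    Each component tries its longest candidate first and falls
    back to a shorter one only if the rest cannot match.
    Returns one substring per component, or None."""
    if not components:
        return [] if text == "" else None
    first, rest = components[0], components[1:]
    for cand in sorted(first(text), key=len, reverse=True):
        tail = match_left_to_right(rest, text[len(cand):])
        if tail is not None:
            return [cand] + tail
    return None


pattern = [alt("week", "wee"), alt("night", "knights"), star("s")]
print(match_left_to_right(pattern, "weeknights"))
```

Note that "wee" + "knights" + "" would also cover the whole string,
but this sketch never gets that far, because the longer first
candidate "week" already leads to a complete match.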
Best regards
keld