context:
BRE: /\(a\{2,3\}\)\1*/
input: aaaa
correct answer: \& = aaaa, \1 = aa
Daniel F. Savarese:
As I revisit POSIX, my reading is that it requires the leftmost
longest match, which makes "aaa" the correct result and the
glibc behavior unexpected.
Your reading is not correct. This point isn't even controversial.
Consistent with the whole match being the longest of the leftmost
matches,
each subpattern [....]
As in, "Consistent with not going over budget, buy the best widget you
can." That does not mean "Buy the best widget money can buy, and this
is asserted to be consistent with not going over budget." It means, the
primary constraint is the budget -- then within that constraint, buy
the best possible widget.
The spec you are quoting says: the correct answer is the longest
possible of the leftmost matches -- given that, here is how to
determine what subpatterns match.
The answer eggert gave is among the leftmost matches (since it
starts at position 0) and is certainly the longest (since it matches
the entire string). Thus, it is the correct answer.
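For anyone who wants to check this by hand, here is a rough sketch in Python. It is only a brute-force emulation of the leftmost-longest rule for the overall match, not how any real POSIX matcher works: the helper name `leftmost_longest` is made up for illustration, and the pattern is transcribed into Python syntax (BRE `\(` `\)` and `\{` `\}` become `(` `)` and `{` `}`). It does not implement the POSIX subpattern rules in general; in this example the only way to match all of "aaaa" is with \1 = aa, so the question does not come up.

```python
import re

# The BRE from the context above, rewritten in Python syntax.
PATTERN = re.compile(r"(a{2,3})\1*")

def leftmost_longest(pattern, text):
    """Brute-force emulation of the overall-match rule:
    the earliest start wins, then the longest end at that start."""
    for start in range(len(text) + 1):
        # Try the longest candidate span first, then shorter ones.
        for end in range(len(text), start - 1, -1):
            m = pattern.fullmatch(text, start, end)
            if m:
                return m
    return None

m = leftmost_longest(PATTERN, "aaaa")
print(m.group(0), m.group(1))  # prints: aaaa aa
```

The budget comes first: the span 0..4 is tried before any shorter one, and only once it is known to be matchable does the question of what \1 holds get answered.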
Now, I'm no expert in the intended behavior of POSIX regular
expressions, but I am rather intimate with the intended
behavior of Perl regular expression behavior (also leftmost
longest)
Perl regular expressions are not leftmost-longest, at least in the
sense of that phrase as used among Posix implementors. Most people I
know use the phrase "first-match" to describe Perl-like regexps.
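To make the difference concrete, here is a small sketch using Python's `re` module as a stand-in for a Perl-like backtracking matcher (that substitution is my assumption; I have not run this through Perl itself, and the helper name `first_match` is made up for illustration). The greedy `a{2,3}` grabs three characters first, `\1*` then matches zero times, and the engine stops at the first overall success instead of looking for a longer one.

```python
import re

def first_match(text):
    """Return (whole match, group 1) as found by a backtracking,
    first-match engine for the pattern from the context above."""
    m = re.search(r"(a{2,3})\1*", text)
    return m.group(0), m.group(1)

print(first_match("aaaa"))  # prints: ('aaa', 'aaa')
```

So on the same input a first-match engine reports "aaa", while the leftmost-longest rule requires "aaaa".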
-t