I believe that this issue was resolved in the June 1995 POSIX RE
experts meeting in Toronto which I chaired. I have enclosed the
minutes below.
Looking over the minutes, I saw nothing that actually addresses this
issue. Can you be more specific? It looks to me like the experts
meeting didn't address the RE-ASSOC issue at all.
You gave two explanations. The first was:
Once the longest leftmost match of the complete string is found,
"weeknights",
which could be matched by
wee knights
or
week night s
The longest leftmost match of the leftmost parenthesised group is
matched. Thus, the result is
week night s
That could mean either that, for the purpose of determining the meaning
of "subpattern" in E.2.8.2., concatenation is considered to be right
associative, or it could mean that "subpattern" always means (only)
parenthesised subexpressions, or it could mean something more
complicated and even less obviously related to the language of the
spec.
For reasons previously mentioned in this thread, I am certain that
"subpattern" is best explained by a _left-associative_ interpretation
of concatenation and, in general, by the grammar. Thus, the
match should be:
wee knights
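To make the difference between the two readings concrete, here is a
rough sketch in Python. The pattern is not restated in this message,
so I am assuming three concatenated pieces along the lines of
(wee|week)(knights|night)(s*); the lists A, B, C and the function
names are mine, purely for illustration. The sketch enumerates every
way the pieces can cover the whole string and then picks one under
each associativity rule.

```python
from itertools import product

# ASSUMPTION: the pattern under discussion is roughly
# (wee|week)(knights|night)(s*). It is not quoted in this message,
# so these alternatives are illustrative only. The s* piece is
# limited to "" or "s", which is all this input can use.
A = ["wee", "week"]
B = ["knights", "night"]
C = ["", "s"]

def parses(text):
    """Every (a, b, c) split whose concatenation is exactly `text`."""
    return [(a, b, c) for a, b, c in product(A, B, C) if a + b + c == text]

def right_assoc(candidates):
    """Read A B C as A (B C): longest A first, then longest B."""
    return max(candidates, key=lambda p: (len(p[0]), len(p[1])))

def left_assoc(candidates):
    """Read A B C as (A B) C: longest A B first, then longest A."""
    return max(candidates, key=lambda p: (len(p[0]) + len(p[1]), len(p[0])))

candidates = parses("weeknights")
print(candidates)               # [('wee', 'knights', ''), ('week', 'night', 's')]
print(right_assoc(candidates))  # ('week', 'night', 's')
print(left_assoc(candidates))   # ('wee', 'knights', '')
```

Both candidates cover the same overall match, so the "longest
leftmost" rule for the whole string does not separate them; only the
rule applied to subpatterns does.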
Interestingly, the grammar-oriented interpretation makes the added
language about nested subexpressions redundant, though perhaps
(indirectly) clarifying.
For other reasons previously mentioned in this thread, I think the
right-associative interpretation is undesirable.
The "parenthesised expressions only" interpretation is, in my
experience, a bad idea -- not only is it harder to implement, but it
confuses users when adding apparently innocent parentheses to an
expression leads to a different matching behavior.
Other more complicated interpretations of "subpattern" are
interesting, but I don't think they have anything to do with the spec.
You also said:
Notice that nested subexpressions are to be considered to the
right of the expression that it nests.
I'm not sure why you mention that.
The conclusions the committee reached regarding repeated nullable
expressions create some new problems with regard to finding the
longest overall match -- but we can get to those in a chat about
the RE-ITERATE question.
-t