| To: | yyyyyy@xxxxxxxxxxx |
|---|---|
| Subject: | Re: RE-ASSOC: a question about the associativity of RE concatenation |
| From: | Tom Lord <yyyy@xxxxxxxxxxx> |
| Date: | Fri, 12 Apr 2002 03:20:47 -0700 (PDT) |
| Cc: | yyyyyyyyyyyyyy@xxxxxxxxxxxxx, yyyyyyyyyyyyyy@xxxxxxxxxxxxx, yyyyy@xxxxxxxxxxxxxx, yyyyy@xxxxxxxxxx |
| References: | <200204091857.OAA35259@raptor.research.att.com> <200204092229.g39MTeX00802@shade.twinsun.com> <200204092254.PAA02436@morrowfield.home> <200204092340.g39Neua00876@shade.twinsun.com> |
eggert:
> From: Tom Lord <yyyy@xxxxxxxxxxx>
>
> The conclusions the committee reached regarding repeated nullable
> expressions create some new problems with regard to finding the
> longest overall match -- but we can get to those in a chat about the
> RE-ITERATE question.
Can you please elaborate on this a bit? I don't get the
connection between RE-ITERATE and the June 1995 minutes. I
assume you're talking about Interp #43, Part 12 "Can a
duplicated subexpression match the null string", but
unfortunately I don't understand all the consequences of that
part of the minutes.
The minutes say:
Can a duplicated subexpression match the null string? If so,
will the duplication be repeated until the expression does
match the null string? There was a consensus that applying a
duplicator to a RE that could match the empty string should be
unspecified.
That's just silly. A repeated nullable expression should match the
longest possible string that is consistent with the overall match. In
the case of a subsequent backreference to that nullable expression,
the empty match may be required to find the longest overall match,
even if the empty match is preceeded by 1 or more non-empty matches of
the repeated expression.
Example:
RE: (x?)*y\1z?
input: xxxxyz
Look at it as Hume suggested: the rules for building up expressions
are a set construction. For each expression, you get a possibly
infinite set of strings. The Kleene closure (* operator) of a set
that includes the empty string is perfectly well defined -- it
includes the empty string. Now generalize the Kleene closure to
handle backreferences. Unless you take a very awkward approach,
you'll permit extending a non-null repetition of a nullable expression
with the null string -- and then use that null string in subsequent
backreferences. So of course a repeated parenthesized nullable
expression can wind up mapping to an empty substring. It isn't
undefined at all.
-t
---
How valuable is my contribution? Share your feedback at
Affero:
http://svcs.affero.net/rm.php?r=tomlord
|
| Previous by Date: | Re: RE-ASSOC: a question about the associativity of RE concatenation, Tom Lord |
|---|---|
| Next by Date: | Re: RE-ASSOC: a question about the associativity of RE concatenation, evought |
| Previous by Thread: | Re: RE-ASSOC: a question about the associativity of RE concatenation, Tom Lord |
| Next by Thread: | Re: RE-ASSOC: a question about the associativity of RE concatenation, Tom Lord |
| Indexes: | [Date] [Thread] [All Lists] |