| To: | yyyyyyyyyyyyyyy@xxxxxxxxxxxxx |
|---|---|
| Subject: | Comments on reg exp standard |
| From: | "Jon Hitchcock" <yyyyyyyyyyyy@xxxxxxxxxxx> |
| Date: | Tue, 28 May 2002 15:51:23 +0100 |
|
First, I will explain that I am not an expert on regular expressions. While reviewing the proposed changes (ERNs 17, 18, 19), there were some places where I found it hard to understand the existing standard. I think the standard should aim to be clear, not just to regular expression experts, but also to people like me.

Line numbers below refer to the published Base Definitions volume. Some of my comments relate to ERN 17, and I assume that the proposed definition of "subexpression" in ERN 18 is accepted, so that the term is defined for both BREs and EREs.

------------------------------------------------------------------------

Line 5908: Is it necessary to say "For this purpose, a null string shall be considered to be longer than no match at all"? The paragraph is about choosing one of a number of matches, so something which does not match is irrelevant. If some subtle point is being made, it would help to have an example giving the different results with and without this rule.

Line 5908: ERN 17 suggests adding "An enclosed subpattern is deemed to be to the right of an enclosing pattern." To me, this seems contrived, and it would be more natural to have a recursive description saying that, where subexpressions are nested, the rule is applied first to the whole expression and then to the subexpressions. An alternative view is that the proposed wording is concise and precise, and that an explanation can be put in the rationale.

Line 6095: ERN 17 suggests changing "whatever" to "any string". Does this make a difference? What else does a subexpression match apart from a string? If some subtle point is being made, an example would help.

Lines 6105-6109: The two sentences ["When the referenced subexpression matched more than one string, the back-referenced expression shall refer to the last matched string. If the subexpression referenced by the back-reference matches more than one string because of an asterisk (’*’) or an interval expression (see item (5)), the back-reference shall match the last (rightmost) of these strings."] seem to say the same thing in different words. If this is so, I suggest removing the first sentence to avoid any doubt that its meaning is subtly different.

Line 6137: The same precedence rules apply when subexpressions and back-references are duplicated. I suggest changing "Single-character-BRE duplication" to just "Duplication".

Line 6162: I am sure I am not the only person who has been misled by the word "Extended" in ERE. I suggest adding:

Note: The specification for EREs is not purely an extension of that for BREs. Back-references are not available, and the notation for subexpressions and interval expressions is different.

Line 6209: As for line 6095 ("whatever").

Line 6253: The term "Grouping" is used nowhere else. I suggest changing it to "Subexpressions".

Line 6254: The same precedence rules apply when EREs enclosed in parentheses are duplicated. I suggest changing "Single-character-ERE duplication" to just "Duplication".
|
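As an illustration of the back-reference rule quoted in the comment on lines 6105-6109, the following is a minimal sketch rather than anything taken from the standard. It assumes Python's `re` module as a stand-in: `re` is not a POSIX BRE implementation, and the pattern `([ab])*\1` is only an approximate counterpart of the BRE `\([ab]\)*\1`. The helper name `last_iteration` is hypothetical. On this particular point Python appears to behave as the quoted text describes: after a repeated subexpression, the back-reference refers to the last string that the subexpression matched.

```python
import re

# Assumption: Python's `re` is used purely for illustration. It is not a
# POSIX BRE engine; "([ab])*\1" approximates the BRE "\([ab]\)*\1".
PATTERN = re.compile(r"([ab])*\1")


def last_iteration(text):
    """Return what group 1 held when the whole of `text` matched, else None.

    Group 1 is repeated by '*', so it may match several strings in turn.
    The back-reference \\1 then has to match the last of those strings.
    """
    match = PATTERN.fullmatch(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # "abb": the group matches "a", then "b"; \1 must match "b", and does.
    print(last_iteration("abb"))
    # "aba": the group matches "a", then "b"; \1 would need "b", but "a"
    # follows, and no other split of the string works, so nothing matches.
    print(last_iteration("aba"))
```

If the back-reference referred to the first matched string instead of the last, "aba" would match and "abb" would not, so the pair of inputs distinguishes the two readings.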