From: Donn Terry <yyyyyy@xxxxxxxxxxxxx>
Date: Mon, 14 Aug 2000 14:25:11 -0700
(Without expressing an opinion on the topic either way) it appears
to me that there's an issue revolving around how the interface will
be used.
Here's one example of how the interface might be used. For speed, GNU
grep builds a DFA that it normally uses in place of the ordinary
regular expression library. Currently, GNU grep uses strcoll to
implement this DFA, but that confuses collation order with collation
sequence and it therefore does not conform to POSIX.2. There's no
portable way to fix this bug with the current POSIX API, other than by
falling back on the POSIX regular expression library whenever matching
encounters a range expression, which would hurt performance
significantly.
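To make the problem concrete, here is roughly what a strcoll-based
range test looks like in the 8-bit model. This is only a sketch of the
technique, not the actual grep source, and the function name
range_match_strcoll is made up for illustration. strcoll itself is the
standard C function; the point is that it answers in terms of the
locale's collation order, which is what causes the trouble described
above.

```c
#include <string.h>

/* Hypothetical helper (not from grep): test whether byte c lies in
   the range [l-u] by comparing one-character strings with strcoll.
   Returns nonzero on a match.  The result depends on the current
   LC_COLLATE setting; in the "C" locale strcoll compares like
   strcmp, so this reduces to a byte-value comparison.  */
int
range_match_strcoll (unsigned char l, unsigned char u, unsigned char c)
{
  char lo[2], hi[2], ch[2];

  lo[0] = (char) l; lo[1] = '\0';
  hi[0] = (char) u; hi[1] = '\0';
  ch[0] = (char) c; ch[1] = '\0';

  return strcoll (lo, ch) <= 0 && strcoll (ch, hi) <= 0;
}
```

In a program that never calls setlocale, this behaves like a plain
byte comparison; the disagreement with POSIX.2 can only show up once a
non-C locale is in effect.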
Now, let's see how we can fix GNU grep using the two different
interfaces proposed in this thread. Let's assume a simple
8-bit-character model; this is unrealistic, but it simplifies the
example code below.
With strseq, I might write something like this. The "Compile" step
would be used when building the DFA, and the "Match" step would be
used when applying the DFA to the input stream.
    /* Compile a range expression [l-u]. */
    char lo[2];
    char hi[2];
    lo[0] = l; lo[1] = '\0';
    hi[0] = u; hi[1] = '\0';

    /* Match the range expression against the character c. */
    char ch[2];
    ch[0] = c; ch[1] = '\0';
    return strseq (lo, ch) <= 0 && strseq (ch, hi) <= 0;
With colseq, I might write this instead:
    /* Compile a range expression [l-u]. */
    int lo = colseq (l);
    int hi = colseq (u);

    /* Match the range expression against the character c. */
    int ch = colseq (c);
    return lo <= ch && ch <= hi;
For this kind of example, colseq is better: the code is easier to read
and understand, and it undoubtedly matches faster. But this is just
one example, and an oversimplified one at that.
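In the 8-bit model the colseq version also fits naturally into DFA
construction, since the compile step can work out the answer for all
256 byte values ahead of time. Below is a minimal sketch of that idea.
Note that colseq is only the interface proposed in this thread, so the
definition here is a placeholder I made up (it just returns the byte
value, i.e. codepoint order) so that the example compiles;
compile_range is likewise a hypothetical name.

```c
#include <limits.h>

/* Placeholder for the proposed colseq interface.  It simply returns
   the byte value; a real implementation would presumably consult
   the locale's collation sequence instead.  */
static int
colseq (int c)
{
  return (unsigned char) c;
}

/* Compile the range expression [l-u]: set in_range[b] to 1 for every
   byte value b whose collation sequence number lies between those of
   l and u, inclusive, and to 0 otherwise.  */
void
compile_range (unsigned char l, unsigned char u,
               unsigned char in_range[UCHAR_MAX + 1])
{
  int lo = colseq (l);
  int hi = colseq (u);
  int b;

  for (b = 0; b <= UCHAR_MAX; b++)
    {
      int ch = colseq (b);
      in_range[b] = lo <= ch && ch <= hi;
    }
}
```

The match step is then just a lookup of in_range[c], with no call
into the locale machinery per input byte.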
Perhaps someone who sees the advantages of strseq can give an example
where it is a better interface.