Email List: Xaustin-review-lX

Re: Defect in XCU iconv

To: yyyyyyy@xxxxxxxxxx
Subject: Re: Defect in XCU iconv
From: Joanna Farley <yyyyyyyyyyyyy@xxxxxxx>
Date: Sat, 13 Apr 2002 11:47:56 +0100
Cc: yyyyyyyyyyyyyyy@xxxxxxxxxxxxx
References: <200204120407.g3C47e35281047@xxxxxx>
Ulrich,

I've asked for additional opinions, including from our I18N expert, and
we have the following comments.


Ulrich Drepper wrote:
> 
> I generally agree with your first point but not with the second:
> 
> >
> > Change the SYNOPSIS in line 19270 to
> >         iconv [-cs] -f frommap -t tomap [file ...]
> >         iconv [-cs] [-f fromcode] [-t tocode] [file ...]
> 
> Is this really equivalent to what the sentence on line 19306f says?
> There it says that other combinations are undefined.  This syntax could
> be interpreted that they are invalid.  Does this make a difference?
The wording on 19306f says that the forms shown in the modified
synopsis lines above produce defined results and other forms
produce undefined results.  Therefore, conforming applications
can't use other forms.  There is no requirement that
implementations reject other synopsis forms.  (Implementations
are allowed to define extensions that define behavior that is
not defined by the standard.)


> 
> I'm not sure whether it is the best idea to introduce the new *map
> names.  Look at your proposed changes:
> 
> > Replace lines 19288 to 19296 with:
> >
> > -f  fromcode
> >       Identify the codeset of the input file. If the option-argument
> > does not contain a slash,
> 
> This would have to be written like
> 
>   -f  fromcode
>       Identify the codeset of the input file.  The option-argument must
>       not contain a slash character. [...]
Yes.

> 
> and then later
> 
>   -f  frommap
>       Identify the codeset of the input file.  The option-argument must
>       contain a slash character. [...]
Yes.

> 
> I find this rather awkward.  I realize that you want to distinguish
> between tocode and tomap to handle your second inconsistency.  But since
> I disagree with your second point I don't see the need.
We reworded it to try and make the use clearer rather than to handle the
second point.
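
For illustration only, the slash rule and the two synopsis forms discussed
above can be sketched as follows. This is a minimal sketch, not text from
the standard or from any implementation; the helper names kind() and
is_defined_form() are hypothetical.

```python
from typing import Optional


def kind(option_argument: Optional[str]) -> str:
    """Classify a -f or -t option-argument by the slash rule.

    Hypothetical helper: an argument containing a slash is treated as a
    charmap file (frommap/tomap), anything else as a codeset name
    (fromcode/tocode). None means the option was not given.
    """
    if option_argument is None:
        return "absent"
    return "map" if "/" in option_argument else "code"


def is_defined_form(from_arg: Optional[str], to_arg: Optional[str]) -> bool:
    """Return True if the pair matches one of the two synopsis forms."""
    kinds = {kind(from_arg), kind(to_arg)}
    # Form 1: both -f and -t are present and both name charmap files.
    if kinds == {"map"}:
        return True
    # Form 2: each of -f and -t is either omitted or names a codeset.
    return kinds <= {"code", "absent"}
```

A False result here means only that the standard leaves the behavior
undefined; as noted above, an implementation is free to accept such a
combination as an extension rather than reject it.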

> 
> Instead I think the parameter names should be renamed to "fromname" and
> "toname".  The -c description should be changed to
> 
>  -c  Omit any invalid characters from the output. When -c is not used,
>      the results of encountering invalid characters in the input stream
>      (either those that are not valid members of the charmap or codeset
>      named by /fromname/ or those that have no corresponding value in
>      charmap or codeset named by /toname/) shall be specified in the
>      system documentation. The presence or absence of -c shall not
>      affect the exit status of iconv.


> 
> Similarly for -s.  The other options only change as in
> s/\(from\|to\)code/\1name/.
> 
> Now to your second inconsistency.  I don't think it really is one.  Yet,
> the iconv utility implementation does provide a feature the interface
> doesn't have.  This does not mean that there is something wrong.  There
> is no requirement for that anywhere.  Utilities can certainly require
> functionality which isn't in the standardized interfaces.
We believe there is an inconsistency between the iconv utility and the
historical practice of the iconv() system interface used to implement
the utility. The rationale for the iconv utility itself mentions the
implementation-defined nature of the valid values on lines 19363-19364:
"The valid values for fromcode and tocode are implementation-defined."


> 
> This means that, if you want to implement the iconv utility with the
> iconv interface, your iconv implementation has to have some non-standard
> feature.  That's how I implemented it.
> 
> The reason why I don't want to see the change you proposed is that it
> severely reduces the usefulness of the -c option (BTW you didn't want to
> change the -s option?) 
The changes should be applied so that the -c and -s descriptions are
consistent.
> I for one know that users are looking for the
> definitive reaction of the iconv utility in such a situation.  There are
> wrong-encoded files and leaving it up to the implementation to decide
> how the utility reacts prevents writing portable scripts.  It would be,
> for instance, possible to abort the conversion altogether.  This is what
> my implementation does in the absence of this flag.

The rationale on lines 19360-19362 (page 505 of XCU6) says:
"The iconv utility can be used portably only when the user provides two
charmap files as option arguments."

So, when the user specifies non-charmap arguments, the iconv utility
cannot be depended on to be portable. 

We understand and appreciate the behavior you are providing. However,
the iconv module developers at Sun have developed their iconv modules so
that conversions keep going as far as possible, for example by
replacing a non-identical character encountered in the input with an
alternative valid character. We believe the definition of the iconv()
function allows both behaviors, and that the intent was that the
iconv utility should as well, given the implementation-defined nature
of the operations on non-charmap arguments.
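
To make the difference between the behaviors concrete, here is a minimal
sketch. It uses Python's built-in codecs as a stand-in for iconv(), which
is an assumption made purely for illustration; the convert() function and
its policy names ("abort", "omit", "substitute") are hypothetical and do
not come from any iconv implementation.

```python
def convert(data: bytes, fromcode: str, tocode: str,
            on_invalid: str = "abort") -> bytes:
    """Convert data between codesets under a chosen invalid-character policy.

    Policies (hypothetical names):
      "abort"      stop at the first character that cannot be converted
      "omit"       drop such characters from the output, as with -c
      "substitute" keep going, replacing them with a valid character
    """
    # Map each policy onto the corresponding Python codec error handler.
    handlers = {"abort": "strict", "omit": "ignore", "substitute": "replace"}
    errors = handlers[on_invalid]
    # Decoding covers characters invalid in the source codeset; encoding
    # covers characters with no corresponding value in the target codeset.
    text = data.decode(fromcode, errors=errors)
    return text.encode(tocode, errors=errors)
```

With input containing a character that has no ASCII equivalent, "abort"
raises an error, "omit" returns the output without that character, and
"substitute" returns the output with a replacement character in its
place. A script cannot rely on any one of these unless the behavior is
specified.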

This second issue most likely cannot be addressed in a TC and
will more likely require an AGR interpretation.

Joanna
