yyyyy@xxxxxxxxxxx (Joe Gwinn) wrote:
>
> >> >However, I'm professionally reluctant to disallow both "odd" (non power
> >> >of 2) and "large" (16 or more bit) bytes because we can't predict the
> >> >future very well. "Failure of imagination" is something that we need to
> >> >be very cautious about. Just because you or I can't imagine something
> >> >(or can't imagine it actually happening) doesn't mean it won't.
> >>
> >> The question is not if such things will ever happen, it's if they are
> >> likely to happen in the next five years. Notice that I said "likely" --
> >> perfect prediction is not required, which is fortunate.
> >
> >Firstly, as has been pointed out, this has already happened. The
> >question is when and if it will become widespread.
>
> What has already happened? I've already lost the thread.
What the previous poster said in the first paragraph I quoted. Systems
with characters that are not simply 8 bits.
> >Secondly, thinking in terms of a 5 year timescale is ridiculous.
> >There is code of mine in widespread use on hundreds of different
> >systems that I wrote 29 years ago. Those of us who write portable
> >code expect it to remain working with only minor changes for some
> >decades, and on systems that did not exist when we wrote it.
>
> I take this example to mean that we should stick to 8-bit bytes and forget
> about wide characters, something that cannot have been provided for in
> 29-year-old code, and thus is sure to break it.
The mind boggles! I didn't write code that remained portable for that
time by writing any such damn-fool assumptions into it! The way that
you get portability is NOT by binding such requirements into code, but
by writing the code such that it requires only the properties that it
genuinely needs.
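
To illustrate what "requires only the properties that it genuinely needs"
can look like, here is a minimal sketch in C. The function names
(put_u32_be, get_u32_be, storage_bits) are hypothetical and not taken from
any code discussed in this thread. The only property relied on is that an
unsigned char holds at least 8 bits and an unsigned long at least 32,
both of which C guarantees; nothing assumes that a byte is exactly 8 bits.

```c
#include <limits.h>
#include <stddef.h>

/* Store the low 32 bits of v as four octets, most significant first.
   Each element holds one octet even if unsigned char is wider than 8 bits. */
void put_u32_be(unsigned char out[4], unsigned long v)
{
    for (size_t i = 0; i < 4; i++)
        out[i] = (unsigned char)((v >> (8 * (3 - i))) & 0xFFu);
}

/* Rebuild the value, masking each element so any extra high bits in a
   wide byte are ignored. */
unsigned long get_u32_be(const unsigned char in[4])
{
    unsigned long v = 0;
    for (size_t i = 0; i < 4; i++)
        v = (v << 8) | (in[i] & 0xFFu);
    return v;
}

/* Bits of storage in nbytes bytes, using CHAR_BIT rather than a literal 8. */
size_t storage_bits(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}
```

The octet count and the shift width of 8 are properties of the external
format, so they are written as constants; the width of a byte is a
property of the machine, so it comes from CHAR_BIT.
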
Anyway, characters of other than 8 bits (even on 8-bit systems) were
widespread 30 years back - the modern generation has just grown up in
the delusion that their assumptions have always held.
> The five years is an IEEE requirement: standards must be revisited every
> five years or less, or the standard is administratively withdrawn for lack
> of interest. While we do our best to honor the existing base, if we change
> nothing, POSIX will ossify and die.
The fact that a standard must get reviewed every five years is NOT an
argument for claiming that its designers should not look ahead further
than five years. Failing to look ahead over several revisions is the
best way to make sure that each version is incompatible with the last,
and that (despite the official position) vendors and users stick to an
outdated version of the standard.
> >> I am not disallowing wide characters, I am only saying that they will
> >> consist of multiple contiguous bytes, and that we need to tease the concept
> >> "byte" away from the concept "character", which heretofore had been more or
> >> less identical.
> >
> >I am certain that you haven't thought that one through. Changing the
> >C and POSIX interfaces to enable that is at least a hundred times as
> >difficult as changing the (very few, mainly networking) interfaces
> >that assume 8-bit bytes and powers of 2 sizes.
>
> I won't argue with your estimate except to say that it seems to me that
> changing something as fundamental as the width of a byte in an operating
> system that's at least 29 years old is not going to be so easy as it may
> appear, so my instinct is to look for ways to add wide characters without
> invading the core of UNIX. That said, my dark instinct can certainly be
> rebutted by even one working counterexample.
Changing from 8 bits is NOT as fundamental as you think, because there
have been Unices based on other widths. But that was not my point. The
assumption that byte = character is FAR, FAR more fundamental in both C
and POSIX than the assumption that there is any particular number of bits
in a character. Changing to 16-bit bytes/characters is tedious and tricky,
but changing to 8-bit bytes and 16-bit characters is almost impossible.
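
A small sketch of why separating the two is so invasive, using C's wide
character type. The helper names (char_count, byte_count) are hypothetical,
and the size of wchar_t is implementation-defined, so the code makes no
claim about the actual ratio on any given system.

```c
#include <stddef.h>
#include <wchar.h>

/* Number of characters in a wide string, excluding the terminator. */
size_t char_count(const wchar_t *s)
{
    return wcslen(s);
}

/* Number of bytes those same characters occupy in storage. */
size_t byte_count(const wchar_t *s)
{
    return wcslen(s) * sizeof(wchar_t);
}
```

Wherever sizeof(wchar_t) is greater than 1, the two counts differ, so
every interface that takes a single length argument has to say which of
the two it means. With byte = character, that question never arises.
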
> >Yes, you are right, and it was a mistake for C ever to confuse those
> >two concepts. But everybody who has thought about it seriously has
> >backed off attempting to change it.
>
> If I read the above correctly, you do not favor changing the width of the
> byte. But I thought you were arguing the contrary.
As I said, and as all people who have looked into the problem seriously
have found, it is changing the assumption that byte = character that is
the almost impossible task. Changing the width of bytes = characters is
nearly trivial by comparison.
> > ... You don't start with C and POSIX if you are going there.
>
> Going where? I've lost the thread.
bytes != characters.
Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: yyyy@xxxxxxxxx
Tel.: +44 1223 334761 Fax: +44 1223 334679