| To: | wollman+austin-group@xxxxxxxxxxx, albert@xxxxxxxxxxxxxxxxxxxxx |
|---|---|
| Subject: | Re: multibyte C locale |
| From: | Joerg.Schilling@xxxxxxxxxxxxxxxxxxx (Joerg Schilling) |
| Date: | Mon, 02 Nov 2009 17:15:01 +0100 |
| Cc: | Glen.Seeds@xxxxxxxxxx, austin-group-l@xxxxxxxxxxxxx |
| References: | <4AE83F49.1090805@byu.net><20091030093610.GA31100@squonk.masqnet><20091030131925.GH28296@prunille.vinc17.org><20091030140706.GA12871@squonk.masqnet><20091030181139.GI28296@prunille.vinc17.org><8CC27A9B7A337AB-1BD0-9799@webmail-d053.sysops.aol.com><4AEB5378.5070402@byu.net><19179.21604.450124.747123@khavrinen.csail.mit.edu><OF7977394B.18C83951-ON8525765F.0077969C-8525765F.0077A6A5@ca.ibm.com><19179.25274.524418.169871@khavrinen.csail.mit.edu><787b0d920910310310x71642ee3i19b7e233e6c64696@mail.gmail.com> |
Albert Cahalan <albert@users.sourceforge.net> wrote:
> C.UTF-8 would be damn nice. Dealing with UTF-8 text is rather
> important these days. Unfortunately, locales like en_US.UTF-8
> get all stupid with collating order. 'a' comes after 'Z' damn it!
> (that is, U+0061 is a bigger number than U+005A) Possibly
> there is some hack with multiple locale variables that will make
> things sane, but that's excessively painful for such a common case.
>
> I expect the plain C locale to cover U+0000 through U+00FF,
> but it does not. At least with glibc, stuff like U+00E0 fails the
> isalpha() test. Ouch, this is broken too.
It would make sense to assume ISO 8859-1 for the extended character set in a
single-byte, 8-bit C locale, since Unicode has become the standard of the
future and ISO 8859-1 is identical to the Unicode code points 0 .. 255.
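
To illustrate what that assumption would mean in practice, here is a minimal
sketch. The names latin1_to_ucs() and latin1_isalpha() are hypothetical and
are not part of any existing libc. The classification used here (letters at
0xAA, 0xB5, 0xBA and at 0xC0 .. 0xFF except 0xD7 and 0xF7) is my reading of
the Unicode alphabetic characters in that range, not something any standard
currently requires of the C locale.

```c
/* Hypothetical helper: under the ISO 8859-1 assumption, a byte value
 * is its own Unicode code point, so the mapping is the identity. */
unsigned long latin1_to_ucs(unsigned char c)
{
    return (unsigned long)c;
}

/* Hypothetical helper: alphabetic test for a single 8-bit char,
 * assuming the byte is ISO 8859-1 encoded. Returns 1 or 0. */
int latin1_isalpha(unsigned char c)
{
    /* ASCII letters A-Z and a-z, written as code points */
    if ((c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A))
        return 1;
    /* feminine ordinal, micro sign, masculine ordinal */
    if (c == 0xAA || c == 0xB5 || c == 0xBA)
        return 1;
    /* 0xC0 .. 0xFF are letters, except multiplication sign (0xD7)
     * and division sign (0xF7) */
    return c >= 0xC0 && c != 0xD7 && c != 0xF7;
}
```

With these definitions latin1_isalpha(0xE0) is true, whereas isalpha(0xE0)
in the plain "C" locale is false, which is the behaviour complained about
above.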
Jörg
--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni)
joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily