Email List: Xaustin-group-lX

Re: AI 2000-05-010: proposed interface

To: yyyyyyyyyyyy@xxxxxxxxxxxxx
Subject: Re: AI 2000-05-010: proposed interface
From: Antoine Leca <yyyyyyyyyyyy@xxxxxxxxxx>
Date: Wed, 23 Aug 2000 10:32:29 +0200
Organization: RENAULT (but this contribution is personal and does not commit RENAULT)
References: <200008161724.NAA0000001345@oflume.zk3.dec.com> <7kM4I54Hw-B@khms.westfalen.de>
Kai Henningsen wrote:
> 
> > I live on Planet Earth. A place where people speak different languages
> > and have different expectations about what any given range includes.
> > Not everyone is a Unix veteran who only uses the C locale.
> >
> > You have often said that you only have U.S. locales on the systems
> > available to you. Your systems may only exhibit 1970s and 1980s
> > behavior with respect to character handling, but most of us have
> > moved way beyond that.
> 
> Yes, most of us non-US types have been badly burned by [a-c] including
> upper case letters.
> 
> That doesn't mean it's right.

Well, the _right thing_ is what is already written in the Standard,
namely that "Range expressions shall not be used in portable applications
because their behaviour is dependent on the collating sequence".
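
As a minimal sketch of the one case that is well defined, the snippet below
pins the locale to C, where "[a-c]" covers exactly a, b and c. The helper
name matches_a_to_c is hypothetical, and nothing here says what a national
locale would do with the same pattern: that is the part the collating
sequence decides.

```sh
#!/bin/sh
# Pin the locale so the range below has a single, known meaning.
LC_ALL=C
export LC_ALL

# Hypothetical helper: does the argument start with a character in [a-c]?
matches_a_to_c() {
    case $1 in
        [a-c]*) echo yes ;;
        *)      echo no ;;
    esac
}

matches_a_to_c apple    # prints "yes"
matches_a_to_c Banana   # prints "no": in the C locale, B is outside a-c
matches_a_to_c cherry   # prints "yes"
matches_a_to_c delta    # prints "no"
```

The same case pattern rules apply to pathname expansion, so "ls [a-c]*"
follows whatever collating sequence is in effect when the shell expands it.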

 
> Actually, what is really needed, IMNSHO, is the ability to select for
> either C or national locale behaviour on a case-by-case basis.

It really depends on the individual. Being strongly accustomed to the
MS-DOS/Windows behaviour, I read this sub-thread with an amused eye.
Clearly for me, the "right" behaviour is to mix the upper-cased with the
lower-cased: I expect "[a-z]*" to catch all files whose names start with
_any_ letter (which is not possible under Unix unless you cheat in some ways
with the order of the collation rules), I am strongly accustomed to seeing
README and Makefile in the middle of the list when I type ls, and last
but not least, I never issue a somewhat dangerous command like "rm [a-c]*"
without first typing "ls [a-c]*" to see what will actually happen.
And yes, I've been burned. Once for sure, perhaps twice; not thrice.

Now, I am certainly not a good representative of the typical Unix user
(although I may be more representative of the typical Linux newbie ;-)).


> And only selecting between them via setting and unsetting LANG is a
> really bad interface.

I agree with you. If you cannot "forget" the POSIX locale behaviour (for
example, because you rely on habits like "vi [A-Z]*"), and at the same time
want to have sensible locale settings for other collating tasks, that's
a problem.

But how many "other collating tasks which take sensible locale settings
into account" do you run?

> Both versions are actually necessary. 

This is where I am not sure I agree. What about

  set LANG=de_DE
  set LC_COLLATE=POSIX

plus perhaps some aliases for "sort", etc., as needed. "grep" will
probably require some ad-hoc script (that only turns the locale on when
-i is issued).
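
Spelled out for a POSIX sh, the two settings above might look like the
sketch below. It assumes a de_DE locale is installed (if it is not, most
tools simply stay in the C locale), and nsort is a hypothetical name for
the kind of "sort" alias mentioned above. LC_ALL is unset first because it
would override both LANG and LC_COLLATE.

```sh
#!/bin/sh
# LC_ALL, if set, takes precedence over everything below.
unset LC_ALL

LANG=de_DE          # assumption: this locale exists on the system
LC_COLLATE=POSIX    # collation only: plain byte order, as in the C locale
export LANG LC_COLLATE

# With POSIX collation, uppercase letters sort before lowercase ones.
sorted=$(printf '%s\n' b A a B | sort | paste -sd ' ' -)
echo "$sorted"      # prints "A B a b"

# Hypothetical wrapper: ask for national collation for this one command.
nsort() {
    LC_COLLATE=$LANG sort "$@"
}
```

What nsort prints depends on the collation rules of the installed locale,
so no output is shown for it here.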


Antoine
