IETF-SSH archive


Re: IUTF8 pseudo-terminal mode



On Mon, May 12, 2008 at 02:00:56AM +0200, Vincent Lefevre wrote:
> On 2006-01-02 18:20:50 -0500, Jeffrey Hutzelman wrote:
> > One could argue that an SSH server running on such a system should look 
> > at the configured locale and configure the PTY appropriately, and that's  
> > probably even a good idea.
> 
> What "configured locale"? The user may use a locale which is not the
> default one at the system level. Perhaps you mean that the SSH client
> should propagate the locale (more precisely, the charmap) to the

Not more precisely. If anything, "charmap" is less precise on several
levels ;) "Locale" is merely incomplete.

Locale and encoding are distinct, _particularly_ as regards Unicode, and
perhaps even for some ISO 2022 "character sets".

Strictly speaking, a character map has nothing to do with formatting and
the like, nor is it synonymous with an encoding. UTF-8 is, I suppose, what
one assumes over an 8-bit channel. But UTF-7 or some other scheme is equally
possible (UTF-7 is even plausible), and UTF-16 is likely. Many systems
employ both UTF-8 and UTF-16 together; UTF-16 is the internal encoding for
most heavyweight Unicode APIs, such as Java, C#, and IBM's ICU (partly
because of historical misunderstandings which pertain to this very
discussion, but also for the practical reason that it is slightly more
compact for Asian languages). It doesn't make sense to assume that
applications will always translate from UTF-16 to UTF-8 and back on both
sides of the opaque data channel.
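
To put a rough number on the size point, here is a minimal sketch in
Python; the helper name encoded_sizes is made up for illustration, and
the byte counts cover only these sample strings, not text in general.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of `text` in UTF-8 and UTF-16.

    "utf-16-le" is used so that no byte order mark is added to the count.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    # Three CJK characters, all inside the Basic Multilingual Plane.
    print(encoded_sizes("日本語"))  # {'utf-8': 9, 'utf-16': 6}
    # Plain ASCII goes the other way.
    print(encoded_sizes("hello"))   # {'utf-8': 5, 'utf-16': 10}
```

The advantage holds only for text dominated by characters that need three
bytes in UTF-8; ASCII-heavy text doubles in size under UTF-16.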

In practice, of course, the "locale" environment variables (LANG, LC_ALL
and friends) often specify both locale and encoding, as in en_US.UTF-8. But
the distinction should be made, because I18N continues to get botched by
standards and implementation groups.
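
As an illustration of how the two are bundled, the following sketch splits
a POSIX-style locale value of the usual form
language[_territory][.codeset][@modifier] into its parts. The function
name split_locale and the regular expression are my own assumptions for
the example, not anything taken from an SSH implementation.

```python
import re

# Assumed shape: language[_territory][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_locale(value):
    """Split a locale string into its parts.

    Returns a dict with keys language, territory, codeset and modifier
    (missing parts are None), or None if the value does not match the
    assumed shape.
    """
    match = _LOCALE_RE.match(value)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(split_locale("en_US.UTF-8"))
    print(split_locale("fr_FR"))  # no codeset: the encoding is left implicit
```

A value such as fr_FR carries a locale but says nothing explicit about the
encoding, which is exactly where the two notions come apart.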



