tech-userlevel: Re: utf-8 and userland

Subject: Re: utf-8 and userland
To: Bill Studenmund <wrstuden@netbsd.org>
From: Wolfgang S. Rupprecht <wolfgang+gnus20040312T095618@dailyplanet.dontspam.wsrcc.com>
List: tech-userlevel
Date: 03/12/2004 13:23:09

Bill Studenmund writes:
> I think that'd be cool. Though I'm not sure if LC_LANG is the right place 
> to look. Wouldn't nl_langinfo(CODESET) be the right thing to look at?

Oops.  So much for me double checking my environment variables before
posting.  The env setting that the uxterm wrapper does is
"LC_CTYPE=en_US.UTF-8".  That does indeed pop out of
nl_langinfo(CODESET) as "UTF-8".  Is uxterm doing the right thing by
setting LC_CTYPE that way?
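
For reference, a minimal sketch of the lookup being discussed, assuming a
POSIX system with <langinfo.h>.  The helper name codeset_for() is made up
for illustration; only setlocale() and nl_langinfo() are real interfaces.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: select the given LC_CTYPE locale and report the
 * codeset name the C library associates with it.  Passing "" selects
 * whatever the environment (LC_ALL, LC_CTYPE, LANG) asks for.
 * Returns NULL if the locale cannot be selected.
 */
const char *
codeset_for(const char *ctype_locale)
{
	if (setlocale(LC_CTYPE, ctype_locale) == NULL)
		return NULL;
	/* The returned string may be overwritten by later locale calls. */
	return nl_langinfo(CODESET);
}
```

Called as codeset_for("") from a shell started by the uxterm wrapper, this
should hand back the "UTF-8" string mentioned above, provided the
en_US.UTF-8 locale is actually installed on the machine.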

I wonder if there is already a table of tables listing the chars that
can safely be output for each codeset.  (Or is it sufficient to simply
let anything that isn't a control char pass unmolested whenever a
codeset is explicitly set by the user?)
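
The second option could be sketched roughly as follows.  This is only an
illustration under stated assumptions, not an existing routine: the name
filter_controls() and the choice of '?' as the replacement are invented
here, and the decoding relies on the standard mbrtowc() and iswcntrl()
calls following whatever LC_CTYPE is currently in effect.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: copy `in` to `out`, replacing control characters
 * and undecodable bytes with '?'.  Everything else passes unchanged.
 * Returns the number of bytes written, not counting the final NUL.
 */
size_t
filter_controls(const char *in, char *out, size_t outsz)
{
	mbstate_t st;
	size_t left = strlen(in), used = 0;

	if (outsz == 0)
		return 0;
	memset(&st, 0, sizeof(st));
	while (left > 0 && used + 1 < outsz) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, in, left, &st);

		if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
			/* Invalid or truncated sequence: emit '?', skip one byte. */
			memset(&st, 0, sizeof(st));
			out[used++] = '?';
			in++;
			left--;
			continue;
		}
		if (iswcntrl((wint_t)wc)) {
			out[used++] = '?';
		} else {
			if (used + n >= outsz)
				break;	/* no room for this character plus NUL */
			memcpy(out + used, in, n);
			used += n;
		}
		in += n;
		left -= n;
	}
	out[used] = '\0';
	return used;
}
```

Note that iswcntrl() also matches newline and tab, so a real filter would
probably want to let those through explicitly.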

-wolfgang