Subject: Re: Return from nl_langinfo(CODESET)--any standard?
To: James K. Lowden <jklowden@schemamania.org>
From: Noriyuki Soda <soda@sra.co.jp>
List: tech-userlevel
Date: 01/26/2003 07:31:28
>>>>> On Sat, 25 Jan 2003 16:46:32 -0500,
"James K. Lowden" <jklowden@schemamania.org> said:
> So, is there any reason why our nl_langinfo() doesn't return an
> "official" name, such as one from
> http://www.iana.org/assignments/character-sets ?
Using IANA names as locale codeset names was once discussed on the
bsd-locale mailing list, and I was one of those who opposed it.
The reasons are:
1. A portable program shouldn't assume that an IANA charset name is
usable as a locale codeset name. Only Linux supports that.
Instead, it should call an intermediate library function to convert
an IANA charset name to a locale codeset name.
So permitting IANA charset names would only spread the influence of
that bad coding style...
2. Some MIME charsets include "_" or "." in their names.
But a locale codeset name shouldn't include such characters, to avoid
ambiguity when parsing a locale name. Note that every locale name
should have the "language[_TERRITORY[.codeset]]" syntax.
3. As Klaus said, IANA has many aliases for a single charset.
So using IANA charset names as locale codeset names increases the
complexity of programs. Note that there are IANA charsets which
have multiple aliases but no preferred MIME name.
4. It's ugly to support a locale name like
ja_JP.Extended_UNIX_Code_Packed_Format_for_Japanese
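To illustrate reason 2, here is a rough sketch of how a program might split a
locale name written in the "language[_TERRITORY[.codeset]]" syntax. The names
split_locale_name(), codeset_after_last_dot() and struct locale_parts are
hypothetical, invented for this example; they are not part of any libc. The
sketch assumes the codeset starts at the first "." after the territory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parts of language[_TERRITORY[.codeset]]. */
struct locale_parts {
    char language[32];
    char territory[32];
    char codeset[64];
};

/* Copy at most dstsize-1 bytes of src and always NUL-terminate. */
static void copy_part(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Strategy A: the codeset is everything after the FIRST '.'. */
static void split_locale_name(const char *name, struct locale_parts *out)
{
    size_t len;
    const char *p;

    assert(name != NULL && out != NULL);
    out->territory[0] = '\0';
    out->codeset[0] = '\0';

    /* The language ends at the first '_' or '.'. */
    len = strcspn(name, "_.");
    copy_part(out->language, sizeof out->language, name, len);
    p = name + len;

    if (*p == '_') {            /* optional territory, up to the first '.' */
        p++;
        len = strcspn(p, ".");
        copy_part(out->territory, sizeof out->territory, p, len);
        p += len;
    }
    if (*p == '.') {            /* optional codeset: the rest of the name */
        p++;
        copy_part(out->codeset, sizeof out->codeset, p, strlen(p));
    }
}

/* Strategy B: the codeset is everything after the LAST '.'. */
static const char *codeset_after_last_dot(const char *name)
{
    const char *dot = strrchr(name, '.');

    return dot != NULL ? dot + 1 : "";
}
```

Both strategies agree on "ja_JP.eucJP". They disagree on a name such as
"en_US.ANSI_X3.4-1968" (ANSI_X3.4-1968 is an IANA name for US-ASCII):
strategy A yields "ANSI_X3.4-1968", strategy B yields "4-1968". Keeping "_"
and "." out of codeset names makes that choice irrelevant.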
But please note that we obviously should support MIME and IANA names
at some level. What I objected to was only using the IANA name as
the codeset name in our locale library.
Probably we should have a library function to support conversion
between IANA charset names and the supported codeset names.
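Such a conversion function might look roughly like the sketch below. This is
only an illustration, not an existing or proposed API: iana_to_codeset(),
codeset_to_iana() and the table are hypothetical, the table is deliberately
tiny, and the codeset names on the right are assumed spellings. IANA charset
names are compared case-insensitively; for the reverse direction the sketch
simply returns the first name listed for a codeset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: one IANA name or alias per row. */
struct charset_map {
    const char *iana_name;  /* IANA name or alias */
    const char *codeset;    /* locale codeset name (assumed spelling) */
};

/* The name to report back for a codeset is listed first. */
static const struct charset_map charset_table[] = {
    { "ISO-8859-1",      "ISO8859-1"  },
    { "ISO_8859-1:1987", "ISO8859-1"  },
    { "latin1",          "ISO8859-1"  },
    { "ISO-8859-15",     "ISO8859-15" },
    { "Latin-9",         "ISO8859-15" },
    { "EUC-JP",          "eucJP"      },
    { "Extended_UNIX_Code_Packed_Format_for_Japanese", "eucJP" },
};

#define CHARSET_TABLE_LEN (sizeof charset_table / sizeof charset_table[0])

/* ASCII-only lowercasing, so the result does not depend on the locale. */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Return 1 if a and b are equal ignoring ASCII case, else 0. */
static int ascii_caseeq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (ascii_tolower((unsigned char)*a) !=
            ascii_tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Map an IANA charset name or alias to a codeset name, or NULL. */
const char *iana_to_codeset(const char *iana_name)
{
    size_t i;

    assert(iana_name != NULL);
    for (i = 0; i < CHARSET_TABLE_LEN; i++)
        if (ascii_caseeq(charset_table[i].iana_name, iana_name))
            return charset_table[i].codeset;
    return NULL;
}

/* Map a codeset name to the first IANA name listed for it, or NULL. */
const char *codeset_to_iana(const char *codeset)
{
    size_t i;

    assert(codeset != NULL);
    for (i = 0; i < CHARSET_TABLE_LEN; i++)
        if (strcmp(charset_table[i].codeset, codeset) == 0)
            return charset_table[i].iana_name;
    return NULL;
}
```

A mail program could then pass the result of nl_langinfo(CODESET) through
codeset_to_iana() before writing a MIME header, and pass a received charset
name through iana_to_codeset() before building a locale name. The alias
handling stays in one place instead of in every application.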
> I'd be very interested to hear what's happening with NetBSD's locale
> support. I have a direct interest: I maintain a database library in which
> we're improving our locale support, and I can't set up an ISO-8859-15
> environment on my favorite operating system.
Hmm. Is there any problem other than the GNU iconv thing
(and the lack of an in-tree iconv(3) and a working strcoll(3)/strxfrm(3))?
--
soda