Re: wchar_t encoding?

To: tech-misc%netbsd.org@localhost
Subject: Re: wchar_t encoding?
From: uwe%stderr.spb.ru@localhost (Valeriy E. Ushakov)
Date: Thu, 20 May 2010 03:55:38 +0000 (UTC)

Paul Koning <Paul_Koning%dell.com@localhost> wrote:

> I'm working on a patch to gdb 7.1 to make it work on NetBSD.  The issue
> is that GDB 7 uses iconv to handle character strings, and uses wide
> chars internally so it can handle various non-ASCII scripts.
> 
> The trouble for NetBSD is that it asks iconv to translate to a character
> set named "wchar_t".  That means "whatever the encoding is for the
> wchar_t data type".  GNU libiconv supports that, so on platforms that
> use that library things are fine.
>
> The trouble is that I'm getting pushback on the patch, because of
> concerns that the encoding used for wchar_t is not actually UCS-4.
> In particular, there is this article:
> http://www.gnu.org/software/libunistring/manual/libunistring.html#The-wchar_005ft-mess
> which says that on Solaris and FreeBSD the encoding of wchar_t is
> "undocumented and locale dependent".  (Ye gods!)

Why are they so surprised about that?  C99 says:

       3.7.3
       [#1] wide character
       bit representation that fits in an object of type wchar_t,
       capable of representing any character in the current locale

It's simply impossible to always use Unicode as the only encoding for
wchar_t, since not all charsets map 1:1 to Unicode.
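
For what it's worth, C99 does give an implementation a way to promise
the opposite: it may predefine __STDC_ISO_10646__ when wchar_t values
are ISO 10646 code points.  A minimal sketch of checking for it; the
helper name wchar_is_iso10646 is made up for illustration, and a
result of 0 only means "no promise", not "definitely something else":

```c
#include <wchar.h>

/* Hypothetical helper: returns 1 if the implementation promises that
 * wchar_t values are ISO 10646 (Unicode) code points, and 0 if it
 * makes no such promise. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

Code that wants to treat wchar_t as UCS-4 could refuse to build, or
fall back to another path, when this returns 0.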

Besides, iconv does not return (fsvo "return") wide strings; it
returns a good old pointer to char.  Do they pass a pointer to wchar_t
as the destination?
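
To make the question concrete, here is a rough sketch of what passing a
wide buffer to iconv looks like.  It assumes the POSIX prototype (char
** for both buffers; some systems declare the input as const char **)
and an iconv that accepts the name "WCHAR_T", as GNU libiconv does.
The helper name utf8_to_wchar_via_iconv is hypothetical, and I am not
claiming this is what GDB actually does:

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a UTF-8 string into a wchar_t buffer
 * through iconv.  Returns the number of wide characters written, or
 * (size_t)-1 on failure, including the case where this iconv does not
 * know the charset name "WCHAR_T". */
size_t utf8_to_wchar_via_iconv(const char *src, wchar_t *dst, size_t dst_len)
{
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;        /* iconv takes char **, not const */
    size_t in_left = strlen(src);
    char *out = (char *)dst;       /* the wide buffer, seen as raw bytes */
    size_t out_left = dst_len * sizeof(wchar_t);

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    /* Bytes consumed in the output buffer, expressed in wchar_t units. */
    return (dst_len * sizeof(wchar_t) - out_left) / sizeof(wchar_t);
}
```

Note that nothing here says what the values stored in dst mean; that
is exactly the part that depends on the implementation and locale.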

If they just assume it's going to be a pointer to a wide string, then
the correct implementation of "wchar_t" is for iconv to convert to a
plain string in the current charset and then convert that to a wide
string.
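
The two-step conversion described above could be sketched roughly like
this.  It is only an illustration under assumptions: the function name
to_wide_two_step and the fixed 256-byte intermediate buffer are made
up, the POSIX iconv prototype is assumed, and the current locale's
charset is taken from nl_langinfo(CODESET):

```c
#include <iconv.h>
#include <langinfo.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert src (encoded in from_charset) to a wide
 * string in two steps.  Returns the number of wide characters stored,
 * not counting the terminator, or (size_t)-1 on failure. */
size_t to_wide_two_step(const char *from_charset, const char *src,
                        wchar_t *dst, size_t dst_len)
{
    char mb[256];                  /* illustrative fixed-size buffer */

    /* Step 1: source charset -> charset of the current locale. */
    iconv_t cd = iconv_open(nl_langinfo(CODESET), from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;        /* iconv takes char **, not const */
    size_t in_left = strlen(src);
    char *out = mb;
    size_t out_left = sizeof(mb) - 1;   /* leave room for the NUL */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t)-1)
        /* Emit any closing shift sequence for stateful encodings. */
        rc = iconv(cd, NULL, NULL, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    *out = '\0';

    /* Step 2: locale multibyte string -> wide string.  If dst_len is
     * reached first, mbstowcs does not write a terminator. */
    return mbstowcs(dst, mb, dst_len);
}
```

A real program would call setlocale(LC_CTYPE, "") first so that both
steps agree on the user's locale; without it everything runs in the
"C" locale.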

Or do they actually assume it's gonna be UTF-32?

SY, Uwe
-- 
uwe%stderr.spb.ru@localhost                       |       Zu Grunde kommen
http://snark.ptc.spbu.ru/~uwe/          |       Ist zu Grunde gehen
