Subject: Re: locale library (was Re: Back in June...)
To: None <tech-userlevel@netbsd.org>
From: T.SHIOZAKI <AoiMoe@imou.to>
List: port-alpha
Date: 01/08/2000 19:36:12
From: itojun@iijlab.net
Subject: Re: locale library (was Re: Back in June...)
Date: Sat, 08 Jan 2000 02:55:23 +0900
Message-ID: <10221.947267723@coconut.itojun.org>
> Soda-san, please add other details.
Let me touch on the background of why I ported the FreeBSD runelocale to NetBSD.
In East Asia, users want a multibyte locale.
In particular, the wcs/mbs conversion functions (such as wcstombs) are
very important for practical use.
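
For readers unfamiliar with these functions, here is a minimal sketch of a multibyte to wide-char round trip using the standard mbstowcs and wcstombs calls. The helper name demo_roundtrip and the buffer size of 64 are assumptions made for illustration only. Without a working multibyte LC_CTYPE locale, only singlebyte input converts in any useful way.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Convert a multibyte string to wide characters and back again.
 * Returns the number of bytes written to out (excluding the NUL),
 * or (size_t)-1 if a conversion fails or a buffer is too small.
 * demo_roundtrip is a hypothetical helper, not a libc function.
 */
size_t
demo_roundtrip(const char *in, char *out, size_t outsize)
{
	wchar_t wbuf[64];	/* illustrative fixed size */
	const size_t wmax = sizeof(wbuf) / sizeof(wbuf[0]);
	size_t nwide, nbytes;

	assert(in != NULL && out != NULL);

	/* multibyte -> wide, interpreted according to the current LC_CTYPE */
	nwide = mbstowcs(wbuf, in, wmax);
	if (nwide == (size_t)-1 || nwide == wmax)
		return (size_t)-1;	/* invalid sequence, or no room for the terminator */

	/* wide -> multibyte */
	nbytes = wcstombs(out, wbuf, outsize);
	if (nbytes == (size_t)-1 || nbytes == outsize)
		return (size_t)-1;	/* unconvertible character, or out too small */

	return nbytes;
}
```

A program would normally call setlocale(LC_CTYPE, "") once at startup so that both conversions follow the user's locale.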
I am thinking of making a new locale framework for the *BSD platforms,
but it will take a long time to finish, and we should not leave the NetBSD
locale stuff as it is for a long time, for the following reasons:
- It has been causing confusion in Japan that multibyte stuff cannot be
  used with the OS native locale of NetBSD, and this turns into a grave
  issue in the X Window System environment. We Japanese must choose
  one of the following ways:
    - Do nothing. This means giving up on a multibyte environment,
      which is too miserable for practical use :-P
    - Use X_LOCALE when building X.
    - Apply an unofficial patch that lets libc use the rune stuff from
      4.4BSD or FreeBSD.
  It is a fatal problem that the above ways are not compatible
  with each other.
- pkg/pkgsrc has appeared in recent NetBSD. Such a form of application
  distribution requires a unified underlying framework.
  The lack of a multibyte locale system hinders growth in the
  variety of pkg/pkgsrc packages for CJK users.
- For the same reason, it is difficult for commercial vendors to sell or
  port their applications for the Far East market.
- We Japanese have already experienced terrible trouble in Linux
  environments, caused by locale incompatibilities between the
  Japanese extended locale stuff and the native glibc locale stuff.
  I do not want NetBSD to repeat such a miserable history :-)
So, I decided to port FreeBSD rune to NetBSD.
This project is called XPG4DL.
(The name "XPG4DL" is derived from how the work started: I began
by making the FreeBSD rune code dynamically loadable on FreeBSD.)
However, I thought that such a port must not cause binary compatibility
problems and/or future fetters. Thus, I aimed to keep binary
compatibility with the existing "singlebyte" NetBSD locale derived
from 4.3BSD, and to allow for future extensibility.
This is done in the following ways:
- I hid the rune APIs thoroughly from user programs.
  Rune is not standard, not portable, and obsolete.
  We should not use the rune APIs, and if we use rune explicitly,
  it will become a fetter on future extensibility.
  Thus, I hid the rune APIs.
- I separated the setlocale function into one for compatibility
  and a new one.
  The symbol name (embedded in libc) of the compatible setlocale is
  not changed. On the other hand, the new setlocale has "__setlocale_mb"
  as its symbol name, and it is renamed by using the __RENAME macro.
  If the compatible setlocale is called, the locale library will
  work in "singlebyte mode". If the new one is called, it will
  work in "multibyte mode". This is important for keeping binary
  compatibility between the singlebyte locale and the multibyte locale.
  For example, the inconsistency of MB_LEN_MAX pointed out by Soda-san
  can be avoided this way, at least on the a.out platforms.
  Of course, this is not enough to keep backward binary compatibility
  on the ELF platforms, but I think it is no problem since the same
  approach is already used in NetBSD, e.g. for the extension of the
  signal stuff.
  (BTW, I would like an extension to the native ELF behaviour so that
  ld/ld.so can embed and check the minor revision of a shared library
  for consistency, as a.out does...)
- I made glue for the NetBSD ctype.h macros instead of rewriting
  ctype.h.
- I intentionally left some features unimplemented.
  Such features include wide-char file I/O, wide-char curses,
  and others.
  As the first step, it is important to make the X Window System work
  in a multibyte locale with the OS locale right away. And most of the
  remaining features would have a big impact on other parts of the OS.
  Additionally, the #define macro versions of the wctype functions
  (such as iswalpha), which are used to improve speed,
  are also unimplemented.
  Macro versions of isw* make it difficult to keep consistency across
  OS versions, and isw* is used less frequently than the is* macros.
  Anyway, it seems better to phase in such features carefully.
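
As an illustration of the split setlocale entry points described in the list above, here is a minimal sketch of the idea. It is not the actual XPG4DL code. Every demo_* name, the locale string, and the value 3 for DEMO_MB_MAX are assumptions, and a plain #define stands in for the __RENAME macro so that the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins for libc internals. */
#define DEMO_MB_MAX 3	/* assumed maximum bytes per character in multibyte mode */

enum demo_mode { DEMO_SINGLEBYTE, DEMO_MULTIBYTE };
static enum demo_mode demo_mode = DEMO_SINGLEBYTE;

/* Old symbol: binaries built against the old headers keep resolving to it. */
const char *
demo_setlocale(int category, const char *name)
{
	(void)category;
	assert(name != NULL);
	demo_mode = DEMO_SINGLEBYTE;
	return name;
}

/* New symbol: only code compiled against the new headers reaches it. */
const char *
demo_setlocale_mb(int category, const char *name)
{
	(void)category;
	assert(name != NULL);
	demo_mode = DEMO_MULTIBYTE;
	return name;
}

/* Largest character length the library may hand back in the current mode. */
int
demo_mb_cur_max(void)
{
	return demo_mode == DEMO_MULTIBYTE ? DEMO_MB_MAX : 1;
}

/* An "old binary": compiled while the name still meant the old symbol. */
int
demo_old_program(void)
{
	demo_setlocale(0, "ja_JP.eucJP");
	return demo_mb_cur_max();
}

/* The new header redirects the name (the real scheme would use __RENAME). */
#define demo_setlocale demo_setlocale_mb

/* A "new binary": identical source text, but compiled after the redirect. */
int
demo_new_program(void)
{
	demo_setlocale(0, "ja_JP.eucJP");
	return demo_mb_cur_max();
}
```

In this sketch demo_old_program() sees a maximum character length of 1, so buffers sized with the old MB_LEN_MAX stay safe, while demo_new_program() gets multibyte behaviour from the same source line.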
I regularly use a libc including this extension with the NetBSD/i386 current
Nov. 13 snapshot; it works well and is very stable in the X environment.
--
Takuya SHIOZAKI - Chair of IMOU.
The I18n/M17n project On Unix environments (IMOU), Japan.