Subject: Re: Unicode support in iso9660.
To: Jason Thorpe <thorpej@shagadelic.org>
From: Reinoud Zandijk <reinoud@netbsd.org>
List: tech-kern
Date: 11/23/2004 13:06:57
Dear folks,
On Fri, Nov 19, 2004 at 07:44:46AM -0800, Jason Thorpe wrote:
> On Nov 17, 2004, at 7:57 PM, MINOURA Makoto wrote:
>
> > - Mountpoints
> > (/<Russian dirname>/<Japanese dirname>/<German filename>,
> > but this could not be accessible from processes with
> > LC_ALL=de_DE.ISO8859-15 for example)
>
> I think this could be handled if UTF-8 were the standard encoding for
> userland<->kernel interaction, yes?
Yes, it would handle that fine. I therefore think that UTF-8 (which can
encode every Unicode code point, up to U+10FFFF) would be fine for this.
For current installations the transition might be a bit tricky, but
providing a simple `don't translate' flag to mount would fix this too: the
users of such a system have apparently found a way to work with it, and
that procedure would then not have to change.
Newly formatted file systems can be filled with whatever UTF-8 allows.
Thus the example above, "/<Russian dirname>/<Japanese dirname>/<German
filename>", will be encoded in UTF-8 on disc and be fully accessible and
readable, given a good font set :-)
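To make the mixed-script case concrete, here is a minimal Python sketch. The
directory and file names are made up and merely stand in for the placeholders
above; it only shows that one UTF-8 byte string can hold all three scripts,
while a single-byte charset such as ISO8859-15 cannot.

```python
# Hypothetical names standing in for <Russian dirname>, <Japanese dirname>
# and <German filename>; they are not taken from any real system.
components = ["Документы", "写真", "Straße.txt"]
path = "/" + "/".join(components)

# One UTF-8 byte string holds all three scripts and round-trips losslessly.
on_disc = path.encode("utf-8")
assert on_disc.decode("utf-8") == path


def representable(name: str, charset: str) -> bool:
    """Return True if every character of name exists in charset."""
    try:
        name.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Show which components a process limited to ISO8859-15 could name.
for name in components:
    print(name, representable(name, "iso8859-15"))
```

Only the German component is representable in ISO8859-15, which is the
accessibility problem raised in the quoted example.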
When copying files from, say, an old disc to the new disc, filenames can be
translated according to the current locale setting; i.e. set the locale to a
Russian encoding and copy the Russian filenames, set it to a Chinese
encoding and copy the Chinese filenames, etc.
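The per-locale copy described above could look roughly like this in Python.
This is a sketch under assumptions: translate_name is a hypothetical helper,
the example filename is invented, and the source charset is passed in
explicitly, where a real copy tool would presumably take it from the current
locale.

```python
def translate_name(raw: bytes, source_charset: str) -> bytes:
    """Re-encode one filename from a legacy charset to UTF-8.

    May raise UnicodeDecodeError if raw is not valid in source_charset,
    which hints that the wrong locale was chosen for this batch of files.
    """
    return raw.decode(source_charset).encode("utf-8")


# Hypothetical name as it might sit on an old disc written under KOI8-R.
old_name = "книга.txt".encode("koi8-r")
new_name = translate_name(old_name, "koi8-r")
print(new_name.decode("utf-8"))
```

Note that a single-byte charset such as KOI8-R accepts every byte value, so
choosing the wrong charset there gives wrong but valid names instead of an
error; the person doing the copy still has to know which files belong to
which encoding.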
When copying files from, say, ISO9660, UDF or NTFS file systems, which do
have a notion of `encoding', the file system code can translate to/from
UTF-8 before names leave the file system.
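As a rough illustration of such a translation at the file system boundary,
here is a Python sketch. It assumes the Joliet extension for ISO9660 (names
stored as UCS-2 big-endian) and NTFS names stored as UTF-16 little-endian;
UDF's compressed Unicode format is left out to keep it short. The table and
function names are hypothetical, and real kernel code would of course be C.

```python
# Assumed on-disc name encodings: Joliet (ISO9660) stores UCS-2 big-endian,
# NTFS stores UTF-16 little-endian. UDF is deliberately not covered here.
ON_DISC_CODEC = {"joliet": "utf-16-be", "ntfs": "utf-16-le"}


def to_utf8(raw: bytes, fs: str) -> bytes:
    """Translate an on-disc name to the UTF-8 form handed to callers."""
    return raw.decode(ON_DISC_CODEC[fs]).encode("utf-8")


def from_utf8(name: bytes, fs: str) -> bytes:
    """Translate a UTF-8 name from a caller to the on-disc form."""
    return name.decode("utf-8").encode(ON_DISC_CODEC[fs])


# Example: a Joliet directory entry for "README" as raw big-endian bytes.
joliet_raw = b"\x00R\x00E\x00A\x00D\x00M\x00E"
utf8_name = to_utf8(joliet_raw, "joliet")
assert from_utf8(utf8_name, "joliet") == joliet_raw
```

Everything outside the file system then only ever sees UTF-8, regardless of
what the medium uses internally.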
> My feeling is that the convergence point should be "UTF-8 at the system
> call layer", i.e. userland gives UTF-8 names to the kernel, the kernel
> gives UTF-8 names to userland. It would then be the responsibility of
> the individual applications/system libraries/kernel subsystems to do
> whatever translation to/from UTF-8 is required.
Looks good.
Cheers,
Reinoud