IETF-SSH archive


Re: UTF8



Niels Möller wrote:
der Mouse <mouse%Rodents.Montreal.QC.CA@localhost> writes:
>
I'm faced with password hashing routines that work with
octet strings, not character strings; etc.

Am I required to reject attempted non-ASCII
strings in these places for no reason other than an inability to know
what the user intended the character set - if any - to be?  (For that
matter, what grounds are there for assuming that octets in the ASCII
range are intended to correspond to ASCII characters, rather than, say,
KOI-7?)

I'm assuming you're talking about the server implementation now
(client side is comparatively trivial; convert input to utf8 based on
the current $LC_CTYPE). On the server side, problem is that at login
time, you don't know the user's $LC_CTYPE. My recommendation is as
follows:

1. Choose one default encoding (be that plain ascii, or latin1, or
   koi-7, or normalized utf-8, depending on your context and
   preference).

2. Provide an option for the sysadmin to say that on his or her
   particular system, some other character set is used for user names
   and passwords.

Then convert the usernames and passwords you get on the wire to the
selected encoding. That almost solves the problem, and it's no big
deal.
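
A minimal sketch of the conversion step described above, assuming the
wire format is UTF-8 and the password hashing routine wants an octet
string in the sysadmin-selected encoding. The names DEFAULT_ENCODING
and wire_to_local are hypothetical, and the NFC normalization step is
an assumption of this sketch, not something the drafts are quoted as
requiring here:

```python
import unicodedata
from typing import Optional

# Hypothetical site-wide default; a sysadmin option would override it.
DEFAULT_ENCODING = "ascii"


def wire_to_local(wire: bytes, encoding: str = DEFAULT_ENCODING) -> Optional[bytes]:
    """Convert a UTF-8 string from the wire into local-encoding octets.

    Returns None when the input is not valid UTF-8 or cannot be
    represented in the local encoding; a caller would treat that as a
    failed login rather than guess.
    """
    try:
        text = wire.decode("utf-8")
    except UnicodeDecodeError:
        return None
    # Assumption: normalize so that equivalent spellings of the same
    # character map to the same octets before hashing.
    text = unicodedata.normalize("NFC", text)
    try:
        return text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return None
```

With a latin1 server, wire_to_local("é".encode("utf-8"), "latin-1")
yields the single octet 0xE9, which is what an octet-string hashing
routine on that system would have seen at password-set time.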

Precisely.  By doing this you extend interoperability from
only those systems that use the same character set (i.e., only
koi-7 systems interoperating with each other) to interoperating
with all clients, as long as a single character set is used
on the server for passwords.

If this is too restrictive (i.e., different users on the same
server use different character sets for their passwords), do
this:

Optionally, to support systems where different users use different
character sets for their usernames and/or passwords, use some per user
configuration or kludgery to figure out the user's character set.

For example, a non-script dot-file in the user's home directory
that you can read to get such useful information as $LC_CTYPE,
preferred umask, etc.  (Things you'd really like to know, but
don't want to run a user script to find out.)
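
As a rough illustration of such a non-script dot-file, the sketch
below assumes a plain KEY=VALUE format with '#' comments. The format,
the key whitelist, and the function names are all hypothetical;
nothing in the file is ever executed or expanded:

```python
# Hypothetical whitelist: only these keys are honoured, everything
# else in the file is ignored.
ALLOWED_KEYS = {"LC_CTYPE", "UMASK"}


def parse_user_settings(text: str) -> dict:
    """Parse KEY=VALUE lines as plain data, with no shell evaluation."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in ALLOWED_KEYS:
            settings[key] = value.strip()
    return settings


def charset_from_lc_ctype(lc_ctype: str, default: str = "ascii") -> str:
    """Extract the charset part of a locale name such as
    'sv_SE.ISO8859-1@euro'; fall back to the default if absent."""
    _, dot, rest = lc_ctype.partition(".")
    if not dot:
        return default
    charset = rest.partition("@")[0]
    return charset or default
```

The server would read the file's contents before authentication,
pass them to parse_user_settings, and use the resulting charset when
converting that user's password from the wire.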

Given how common such systems are, it seems a bit odd that the IETF
would take a position so apparently incompatible with them.

Do you have some numbers to back that up? I've seen quite some number
of unix systems, but as far as I can recall, I've *never* seen one
where usernames and passwords used non-ascii characters. (I *have*
seen plenty of non-ascii filenames, but as I said, that's a different
issue, and irrelevant to the core drafts). I live in latin1-land, not
asia, though.

I will say that Windows can and does use non-ascii usernames and
passwords, and it is not an uncommon operating system, though it
is not the most common of server platforms.

- Joseph


