IETF-SSH archive


Re: UTF8



der Mouse <mouse%Rodents.Montreal.QC.CA@localhost> writes:

>> As far as I understand, Unix /etc/passwd only supports ASCII
>> usernames,
>
> I've seen this, or equivalent things, claimed before.  I finally tried
> it, and it's not true, at least not for the Unix variant I have at
> ready hand (five-year-old NetBSD).  I created a user whose name is 0xe5
> 0x67 0x65 (Latin-1 "åge" - I happen to know of someone named Åge).
> vipw did not complain.  I set its password; passwd did not complain.
> "su åge" worked.  "ssh localhost -l åge" worked, too, with an ssh
> that's ssh 1.2.14 in all username-processing respects.

That's a useful data point, thanks.

>> I'm not sure I see the problem.  Implementations that don't know
>> what charset their authentication database uses will be limited to
>> ASCII, or whatever safe subset they can assume.
>
> Why?  Why should one octet-string system talking to another
> octet-string system be unable to use non-ASCII octets?

You shouldn't compare octet-strings with each other unless you know
which charset they were encoded in.

An ASCII string that is equal to an EBCDIC string, octet-wise, does
not necessarily mean the same thing.
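
To make that concrete, here is a small sketch in Python.  It assumes
"cp037" (one common EBCDIC code page, chosen only for illustration) as
the EBCDIC variant; the username "root" is likewise just an example.

```python
# The same four octets, interpreted under two different charsets.
octets = b"root"  # 0x72 0x6f 0x6f 0x74

as_ascii = octets.decode("ascii")    # what an ASCII system sees
as_ebcdic = octets.decode("cp037")   # what an EBCDIC (cp037) system sees

# The octets are identical, but the decoded text is not.
print(repr(as_ascii), repr(as_ebcdic))
print(as_ascii == as_ebcdic)
```

An octet-wise comparison would call these two names equal even though
the two systems read them as different text.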

In your example, an SECSH implementation would need to know what
charset /etc/passwd is using, which here is ISO-8859-1.  This could be
specified through a configuration file, unless there is a system
default.  Then, if SECSH used UTF-8, the server would have to convert
between UTF-8 and ISO-8859-1 before comparing.

With this in place, your system could support Latin-1 usernames
reliably.
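
A rough sketch of that server-side comparison, in Python.  The names
DB_CHARSET and usernames_match are hypothetical, not taken from any
SECSH implementation, and the sketch assumes the protocol carries
usernames as UTF-8 and that the database charset is known from
configuration.  It ignores Unicode normalization.

```python
# Hypothetical setting: the charset of the authentication database,
# assumed to come from a configuration file or a system default.
DB_CHARSET = "iso-8859-1"


def usernames_match(wire_name: bytes, db_name: bytes,
                    db_charset: str = DB_CHARSET) -> bool:
    """Compare a UTF-8 username from the wire with a database entry.

    Both octet strings are decoded to text first, so the comparison
    is made on characters rather than on raw octets.
    """
    try:
        wire_text = wire_name.decode("utf-8")
        db_text = db_name.decode(db_charset)
    except UnicodeDecodeError:
        # Octets that are not valid in the expected charset never match.
        return False
    return wire_text == db_text


stored = b"\xe5ge"                            # Latin-1 entry in /etc/passwd
from_client = "\u00e5ge".encode("utf-8")      # what a UTF-8 client sends

print(usernames_match(from_client, stored))   # charset-aware comparison
print(from_client == stored)                  # naive octet comparison
```

The charset-aware comparison accepts the name, while the naive octet
comparison rejects it.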

Try 'ssh -l åge remote-server' from a host using UTF-8 to see why the
octet-string view is not reliable: the client would send 0xc3 0xa5
0x67 0x65, which does not match the 0xe5 0x67 0x65 in /etc/passwd.

Hope this helps,
Simon


