IETF-SSH archive


Re: SFTP and unicode file names...



Joseph Galbraith <galb-list%vandyke.com@localhost> writes:

> I would definitely prefer to see the server do the translation
> when it can... that's why we went to UTF-8 in the first place.

I think there's one more use case that you need to consider, which I
expect is quite common:

The remote filesystem uses the foo charset, and the local system uses
the same foo charset. Why do I think this is common? Because on both
sides, it's the same user's files, and the user is likely to use his
or her favourite charset (iso-8859-1, utf-8, euc-jis, whatever) on
most or all systems where he or she has an account.

What you call "raw mode" will work fine in this case, regardless of
whether the sftp implementation on the server or client side knows
about the foo charset.
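
To illustrate why raw mode is enough here, a minimal sketch, assuming
iso-8859-1 stands in for "foo". The function raw_mode_transfer and the
file name are hypothetical, not part of any sftp implementation; the
point is only that the transport never interprets the bytes.

```python
def raw_mode_transfer(name_bytes: bytes) -> bytes:
    """Hypothetical raw-mode transport: the bytes pass through untouched."""
    return name_bytes


# Assumption: both systems use the same charset for filenames.
charset = "iso-8859-1"

# Bytes as stored on the server's filesystem.
server_bytes = "résumé.txt".encode(charset)

# Neither sftp endpoint needs to know the charset to move the bytes.
client_bytes = raw_mode_transfer(server_bytes)

# The client's local tools interpret the bytes with the user's own charset.
client_name = client_bytes.decode(charset)
```

The name arrives intact even though nothing in between understands the
charset.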

I like Jeffrey Hutzelman's proposal: Have two modes of operation, and
let the client select which mode it prefers:

 1. Server tells client the server's best guess as to what character
    set is used for filenames, and doesn't convert filenames in any
    way.

 2. All filenames on the wire are utf-8. Server converts filenames to
    and from utf-8 on a best effort basis, according to its best
    guess of the actual charset. (I don't yet know what the right
    thing to do is if/when conversion fails.)

In both cases, server can tell client what charset is used on the
server side (or "unknown") at startup. Client can select mode of
operation either as a global protocol state or as a per-request flag.
I don't think I care very much if the mode selection is global or per
request; I'd expect most clients to choose one mode and stick to that.
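
The two modes could be sketched roughly as follows. This is only an
illustration under stated assumptions: the names Mode, Server,
announce_charset, name_to_wire and name_from_wire are made up for this
example and do not come from any draft. Conversion failures simply
raise, since the right behaviour there is still open, and refusing
utf-8 mode when the charset is unknown is a placeholder policy.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    RAW = 1   # mode 1: server never converts filenames
    UTF8 = 2  # mode 2: filenames on the wire are utf-8


class Server:
    """Hypothetical server side of the two-mode scheme."""

    def __init__(self, charset_guess: Optional[str]):
        # Best guess at the filesystem charset, or None if unknown.
        self.charset_guess = charset_guess

    def announce_charset(self) -> str:
        # Sent to the client at startup in both modes.
        return self.charset_guess or "unknown"

    def _require_guess(self) -> str:
        if self.charset_guess is None:
            # Assumption: policy for an unknown charset is undecided.
            raise ValueError("utf-8 mode needs a charset guess")
        return self.charset_guess

    def name_to_wire(self, fs_name: bytes, mode: Mode) -> bytes:
        if mode is Mode.RAW:
            return fs_name
        # Best effort: UnicodeDecodeError propagates if the guess is wrong.
        return fs_name.decode(self._require_guess()).encode("utf-8")

    def name_from_wire(self, wire_name: bytes, mode: Mode) -> bytes:
        if mode is Mode.RAW:
            return wire_name
        # UnicodeEncodeError propagates if the charset lacks a character.
        return wire_name.decode("utf-8").encode(self._require_guess())


server = Server("iso-8859-1")
fs_name = "blåbär.txt".encode("iso-8859-1")
raw_wire = server.name_to_wire(fs_name, Mode.RAW)
utf8_wire = server.name_to_wire(fs_name, Mode.UTF8)
```

Passing the mode on each call models the per-request variant; a client
that picks one mode and sticks to it just passes the same value every
time.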

The reasons I like this are:

 * It is fairly simple.

 * The client has the final say on whether or not the server should do
   any conversion.

 * The use case above is easily supported, without requiring any
   special server (or client) support for the obscure foo charset.

 * I don't see any use cases where the above scheme fails and a more
   complicated scheme could do better.

Regards,
/Niels


