IETF-SSH archive


RE: UTF8 in SFTP (was: solving the SFTP text mode issue)



I believe that a text transfer mode is sufficient to meet our customers'
needs.  Users view SCP and SFTP as replacements for FTP and hence expect to
be able to transfer text files as well as binary files.  Though FTP defines
a record transfer mode, it is seldom used, so I don't think the development
of the SSH File Transfer Protocol should be encumbered with the work of
adding a record transfer mechanism at this time.

The text transfer mechanism in the SSH File Transfer Protocol should define
a single method of encoding the line end to remove any ambiguity.  Systems
that encode line breaks differently from the specified method would be
responsible for scanning the data and performing the necessary substitution.
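
As a rough illustration of the substitution described above, here is a
minimal sketch.  It assumes CR/LF as the single on-the-wire line end; that
choice is only an assumption for the example, since nothing here settles
which encoding the protocol should pick.  The names WIRE_EOL, to_wire and
from_wire are hypothetical.

```python
# Assumption: CR/LF is the canonical line end on the wire.
WIRE_EOL = b"\r\n"


def to_wire(data: bytes, local_eol: bytes) -> bytes:
    """Replace the sender's local line end with the canonical one."""
    if local_eol == WIRE_EOL:
        return data  # already canonical, nothing to scan
    return data.replace(local_eol, WIRE_EOL)


def from_wire(data: bytes, local_eol: bytes) -> bytes:
    """Replace the canonical line end with the receiver's local one."""
    if local_eol == WIRE_EOL:
        return data
    return data.replace(WIRE_EOL, local_eol)
```

This version works on a whole buffer.  A streaming implementation would
also have to handle a line end that is split across two data packets.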

One additional thing to note: when a file is transferred in text mode, the
size reported for the file must be treated as an estimate, because computing
the exact size of the converted text may take too many resources or too much
time for the request to be answered promptly.  The only way to know that all
of the text has been retrieved is to receive an end-of-file status in
response to a request for data.
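
A client-side read loop along these lines might look like the sketch below.
The read_next callable is hypothetical: it stands in for one sequential read
request and returns None when the server answers with end-of-file status.
The reported size is deliberately not used to decide when to stop.

```python
from typing import Callable, Optional


def read_until_eof(read_next: Callable[[int], Optional[bytes]],
                   chunk_len: int = 32768) -> bytes:
    """Collect text-mode data until the server reports end of file."""
    buf = bytearray()
    while True:
        chunk = read_next(chunk_len)  # hypothetical: one read request
        if chunk is None:             # end-of-file status received
            break
        buf += chunk
    return bytes(buf)
```

The estimated size is still useful for a progress display, as long as the
client tolerates the transfer ending earlier or later than predicted.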

----------------------
Richard Whalen
Process Software



> -----Original Message-----
> From: Joseph Galbraith [mailto:galb-list%vandyke.com@localhost]
> Sent: Monday, May 13, 2002 10:53 AM
> To: Wei Dai; denis.bider%denisbider.com@localhost
> Cc: ietf-ssh%netbsd.org@localhost
> Subject: Re: UTF8 in SFTP (was: solving the SFTP text mode issue)
> 
> 
> > Here's another simple suggestion for your consideration.
> > Define a new pflags flag for SSH_FXP_OPEN:
> > 
> >    #define SSH_FXF_TEXT            0x00000040
> > 
> > It would have the following meaning:
> > 
> >    SSH_FXF_TEXT
> >       File SHOULD be encoded as UTF-8 using either CR, LF,
> >       or CR/LF to indicate line end, or converted to this
> >       encoding before transfer.
> >       File access MUST be sequential.
> 
> First, I don't think SHOULD is strong enough here.
> For this to be useful, it would need to be MUST.
> Probably this text should be included:
> 
>         If the server can not comply, it must respond
>         with status SSH_FX_OP_UNSUPPORTED.
> 
> I don't think I can determine the current encoding
> of an arbitrary text file in order to convert
> its contents to UTF-8.  (Some files will be
> encoded in Unicode, and of course there is no
> problem with those.)
> 
> I believe the unix folks have the same problem
> with file names -- which is why I've been having
> such a hard time getting them to swallow UTF-8
> for file names.
> 
> I actually think that for both cases, it may be
> necessary to make UTF-8 optional, determined
> by the server.
> 
> I.e., the client MUST be able to read filenames and content
> encoded in UTF-8.  The server MAY send filenames and/or
> content in UTF-8.  If content/filenames are not encoded in UTF-8, 
> their encoding is unspecified, and user intervention may
> be required to determine how to display / save the file.
> 
> If the SSH_FXF_TEXT flag is set during open, and the server
> will send file content in UTF-8, it should respond with
> status code SSH_FX_OK_UTF8 (status code 9) instead of
> SSH_FX_OK.
> 
> If the server will encode filenames in UTF-8, it should
> include the following extension data in its VERSION
> packet (if and only if the client's INIT packet specified
> a version >= 3):
> 
>   "filename-utf8"        # extension name
>   ""                     # no extension data
> 
> - Joseph
> 
> 


