IETF-SSH archive


Re: Text file type hint proposal for filexfer



Thanks for the feedback.

Niels Möller <nisse%lysator.liu.se@localhost> writes:
> I think a MIME content-type makes perfect sense as an extension, but I
> agree it shouldn't be in the sftp spec proper. Therefore, I think it
> is a little confusing to use the name "CONTENT_TYPE" for the attribute
> you propose. Perhaps something like FILE_MODE, OPEN_MODE, FILE_TYPE
> might work?

Fair point. How about `textmode_hint'? 

(Updated proposal:
<http://www.chiark.greenend.org.uk/ucgi/~jacobn/cvsweb/ssh-filexfer-filetype/draft-ietf-secsh-filexfer-05-plus-filetype.txt.diff?r1=1.3&r2=1.6&f=H>
or <http://tinyurl.com/53sbd>.)


> One other observation:
> 
> If you have this attribute, *and* there is some way of knowing the
> server's line end convention,

Well, there's the rub, really. How do you do that? You can't rely on the
"newline" extension being present, or if it is, corresponding with what
the server will send in binary mode.

Also: I'm not familiar with VMS, but the impression I have is that line
breaks there are not represented by a byte sequence at all, but rather
by something that it's not possible to squirt down an SFTP "binary"
connection. Hence our having an explicit text mode in SFTP in the first
place.

My intended semantics for the flag are purely "it's a good idea to open
this file in FXF_TEXT mode", not "if you open this file in binary mode
you can assume a certain line ending delimiter".
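
To illustrate those semantics, here is a rough client-side sketch in
Python. Everything in it is an assumption made for illustration: the
names KNOWN_TEXT and KNOWN_BINARY, the numeric values, and the flag
bits are placeholders rather than the encoding in the draft, and
falling back to binary on UNKNOWN is just one possible client policy.
The only point is that the hint influences the choice of open mode and
promises nothing about the bytes seen in binary mode.

```python
from enum import Enum


class TextHint(Enum):
    # Hypothetical names and values; only UNKNOWN is named in this thread.
    KNOWN_TEXT = 0
    KNOWN_BINARY = 1
    UNKNOWN = 2


# Placeholder bit values, NOT taken from the filexfer draft.
FXF_READ = 0x01
FXF_TEXT = 0x40


def open_flags(hint, user_override=None):
    """Pick open flags for reading a remote file.

    An explicit user choice (True = text, False = binary) wins.
    Otherwise text mode is used only when the server says the file
    is text; UNKNOWN falls back to binary (an assumed policy).
    """
    if user_override is not None:
        want_text = user_override
    else:
        want_text = hint is TextHint.KNOWN_TEXT
    return FXF_READ | (FXF_TEXT if want_text else 0)
```

A client would call open_flags() with the hint it got back from the
server's attributes, and pass the result in its open request.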

> then it would be possible for a client to use the following strategy:
> 
> Always SFTP_OPEN files in binary mode. Use FSTAT to ask the server if
> it's a text or binary file. Next, figure out what are the server's and
> the client's native line ending convention. Then convert the file as it
> is read.

This proposal sounds like an entirely different approach from the one
SFTP currently supports. If my understanding of the VMS issue is
correct, it would not work there, since there is no line-ending byte
sequence to convert. Anyway, I think we've argued this one enough on
this WG already, and we should stick with the current mechanism.

(Are you proposing this to get round the atomicity issue I mentioned?)

> As far as I can see, this simple scheme will work just right for
> servers where your new attribute is fully supported (i.e it is always
> correct, and never returns "unknown"). It will of course always fail
> on servers on operating systems that
> 
>  * differentiate between text and binary files
> 
>  * doesn't provide a reliable way for an application, e.g. the sftp
>    server, to find out if a given file name refers to a text or binary
>    file.
> 
> However, I don't see any way to get things right automatically in this
> case; there's no reliable information anywhere that says definitely if
> the file is text or binary, so it must be guessed, either by the user,
> or by the client or by the server. And whenever that guess is wrong,
> file corruption can be expected.

Of course. I don't think there's anything the SFTP protocol can do to
help with the latter kind of system (no reliable way to tell text from
binary). The former kind (systems that do differentiate, which could be
said to include VMS, if my understanding's correct), we can and should
support.

(An SFTP server in the latter situation is of course at liberty to
populate this flag by guessing based on file contents, for instance; I'd
say that was a quality-of-implementation issue and outside the scope of
the spec.)
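
For what it's worth, such a guess might look something like the
following Python sketch. This is one possible heuristic, not anything
the proposal specifies: the function name, the string results and the
choice of UTF-8 as the test are all my assumptions, and a real server
would want something better informed about its local conventions.

```python
import codecs


def guess_text_hint(sample: bytes) -> str:
    """Guess "text", "binary" or "unknown" from the start of a file.

    `sample` is assumed to be the first few kilobytes of the file.
    """
    if not sample:
        # Nothing to go on.
        return "unknown"
    if b"\x00" in sample:
        # NUL bytes are rare in text files; treat as binary.
        return "binary"
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the
        # end of the sample.
        decoder.decode(sample, False)
    except UnicodeDecodeError:
        # Not valid UTF-8, but could be text in another encoding.
        return "unknown"
    return "text"
```

Returning "unknown" whenever the evidence is weak keeps the server from
asserting something it can't back up.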

> Is this problematic case common?

Endemic, I think; probably more the rule than the exception. None of
the OSes I use frequently (Unix, DOS/Windows) can reliably distinguish
text from non-text files. That shouldn't stop us providing the facility
in the protocol.

> To me, it seems so fundamentally broken that I don't think we should
> work very hard to try to fix it.

Agree that we can't magic the information out of nowhere, which is why I
provide UNKNOWN.

Personally, I wasn't too sure about having SFTP deal with text file
semantics in the first place; but given that the WG has decided to do so
(and it is a frequent feature request from users), we should provide a
complete set of operations for our chosen abstraction, so that we are
not the link that prevents reliable transfer of text files. Let's learn
something from FTP...


