IETF-SSH archive


Re: Text file type hint proposal for filexfer



Ahhh... do you actually open the file differently at
the OS level for text mode?  That hadn't occurred to
me.

Still, for VMS, this proposal isn't any worse than the
previous one, and this proposal is a little more compact.
VMS would not set the TEXT_HINT bit in its
supported-attrib field, and clients would know they can't
switch into text-mode.

I'm not sure atomicity is really that big a deal...
maybe I should just bag the switch to text mode
anyway.

I do still prefer adding a couple of bits to the
attrib-bits field to adding a whole new field.

Thanks,

Joseph


Richard Whalen wrote:
I don't think that this scheme will work for VMS.
Though we can report on the format of the file, the idea of opening it in an
unspecified mode, then specifying that you really want text mode may not be
possible without sacrificing the atomicity that some want.

-----Original Message-----
From: Joseph Galbraith [mailto:galb-list%vandyke.com@localhost]
Sent: Tuesday, October 05, 2004 3:21 PM
To: ietf-ssh%netbsd.org@localhost
Subject: Re: Text file type hint proposal for filexfer


How about this alternative-- sftpv5 has (and naturally v6 will have)
an 'attrib-bits' field in the attrib structure which describes
various bit attributes, such as case-sensitivity, if the file
is encrypted on disk, compressed on disk, advisory readonly,
a hidden file, a system file, etc.

What if we add two bits there:

     #define SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT      0x00000800

        The server is reasonably certain that the file
        contains textual data, and therefore the file
        should be opened in text mode.

        This flag MUST NOT be present during a setstat operation.
        If this flag is present during an fsetstat operation,
        the file handle is converted to a text-mode handle, as
        if it had been opened with SSH_FXF_ACCESS_TEXT_MODE.

        The server MUST NOT set this bit unless it is reasonably
        certain the file should be transferred in text-mode because
        many clients will use this flag to initiate automatic
        text-mode translation.  If the server sets this flag
        in error, data corruption will result.

     #define SSH_FILEXFER_ATTR_FLAGS_BIN_HINT       0x00001000
        The server is reasonably certain that the file
        contents are binary and should not be opened
        with SSH_FXF_ACCESS_TEXT_MODE.
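
As a rough illustration of the setstat/fsetstat rule above, a server
might classify an incoming attrib-bits value along these lines.  This is
only a sketch: the enum names and check_text_hint() are hypothetical,
the bit value is the one proposed in this message, and how a server
actually reports the rejection is not specified here.  What to do with
BIN_HINT on a set operation is not specified either, so the sketch
ignores it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit value as proposed in this message; not from a published draft. */
#define SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT 0x00000800u

/* Hypothetical names, used only for this sketch. */
enum set_op { OP_SETSTAT, OP_FSETSTAT };
enum hint_result { HINT_IGNORE, HINT_REJECT, HINT_CONVERT_HANDLE };

/* Decide what a server does with TEXT_HINT in a set request. */
enum hint_result check_text_hint(enum set_op op, uint32_t attrib_bits)
{
    assert(op == OP_SETSTAT || op == OP_FSETSTAT);

    /* Hint bit absent: nothing special to do. */
    if (!(attrib_bits & SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT))
        return HINT_IGNORE;

    /* "This flag MUST NOT be present during a setstat operation." */
    if (op == OP_SETSTAT)
        return HINT_REJECT;

    /* On fsetstat the handle becomes a text-mode handle. */
    return HINT_CONVERT_HANDLE;
}
```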

This gets around your atomicity issue (I think.)  The client
can open sans SSH_FXF_ACCESS_TEXT_MODE, do an fstat, and then,
if the server comes back with SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT
in the attrib.attrib-bits, it can do a fsetstat to put the
handle into text mode.
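
The client side of that sequence might look roughly like the sketch
below, which covers only the decision made after the fstat reply
arrives.  The struct and function names are hypothetical.  Checking the
server's supported attrib bits first, and treating a reply that carries
both hints as "do not switch", are assumptions of the sketch rather than
anything the proposal states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as proposed in this message; not from a published draft. */
#define SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT 0x00000800u
#define SSH_FILEXFER_ATTR_FLAGS_BIN_HINT  0x00001000u

/* Hypothetical view of the two values the client needs. */
struct fstat_view {
    uint32_t supported_bits; /* attrib bits the server says it supports */
    uint32_t attrib_bits;    /* attrib-bits from the fstat reply */
};

/* Return true if the client should fsetstat the handle into text mode. */
bool should_switch_to_text(const struct fstat_view *v)
{
    assert(v != NULL);

    /* Server never sets the hint: the client cannot rely on it. */
    if (!(v->supported_bits & SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT))
        return false;

    /* Assumption: a binary hint (even a contradictory one) wins. */
    if (v->attrib_bits & SSH_FILEXFER_ATTR_FLAGS_BIN_HINT)
        return false;

    return (v->attrib_bits & SSH_FILEXFER_ATTR_FLAGS_TEXT_HINT) != 0;
}
```

Staying in binary mode whenever there is doubt is the conservative
choice, since an unwanted text-mode translation corrupts data.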

MIME types have come up enough times that I think I'm
going to add a field for them; anybody that doesn't
want them shouldn't need to implement them, though.

(I know of at least one filesystem that does know the
mime-type information though.)

- Joseph

Jacob Nevins wrote:

SFTP v4 (draft-ietf-secsh-filexfer-04) added the FXF_TEXT flag to OPEN
to allow the client to request that the server open a file in text
mode.

However, no means is provided for the server to advise the client on
whether this transfer mode is appropriate. In all cases it is likely
up to the user to manually specify which mode to use.

In FTP, this has tended to lead to corrupted file transfers when
unintentional translation took place, etc. Some FTP implementations
have performed "automatic" conversions, e.g., using heuristics based
on file contents; this is not reliable, and anyway rather defeats the
point of having the server translate to a known representation.

The following proposal allows the server to optionally communicate
this information to the client via attributes. (Since it's an
attribute, the client can also manipulate it on the server as far as
is specified here if it corresponds to state there.)

(This proposal is against draft-ietf-secsh-filexfer-05.)

To section 5 "File Attributes", 2nd para, add between `attrib-bits' and
`extended_count':

      byte     content_type         present only if flag CONTENT_TYPE

To section 5.1 "Flags", add:

      #define SSH_FILEXFER_ATTR_CONTENT_TYPE      0x00000400

Add new section after section 5.8 "attrib-bits":

5.x Content type

  The `content_type' field, if present, indicates whether the file is
  known to be a text file, known _not_ to be a text file, or is of
  unknown content.  When sent from server to client, it acts as a
  hint to the client as to whether a file should be opened in
  SSH_FXF_TEXT mode.  The following values are defined:

       #define SSH_FILEXFER_CTYPE_UNKNOWN         0
       #define SSH_FILEXFER_CTYPE_BINARY          1
       #define SSH_FILEXFER_CTYPE_TEXT            2

To the end of the description of SSH_FXF_TEXT in section 6.3.1
"Opening a File", add:

     Clients MAY decide whether to use SSH_FXF_TEXT based on a
     previously seen `content_type' attribute; see Section 5.x.
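
For illustration, a client could turn a previously seen `content_type'
into an open-mode decision roughly as follows.  This is a sketch, not
part of the proposal: the struct and function names are hypothetical,
and falling back to the user's manually chosen mode when the attribute
is absent, unknown, or unrecognised is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the proposed section 5.x. */
#define SSH_FILEXFER_CTYPE_UNKNOWN 0
#define SSH_FILEXFER_CTYPE_BINARY  1
#define SSH_FILEXFER_CTYPE_TEXT    2

/* Hypothetical view of the attribute as cached from an earlier STAT. */
struct ctype_view {
    bool    present;       /* CONTENT_TYPE flag was set in the attrs */
    uint8_t content_type;  /* valid only if present is true */
};

/* Return true if the client should request text mode on OPEN. */
bool use_text_mode(const struct ctype_view *v, bool user_default)
{
    assert(v != NULL);

    if (!v->present)
        return user_default;   /* server gave no hint */

    switch (v->content_type) {
    case SSH_FILEXFER_CTYPE_TEXT:
        return true;
    case SSH_FILEXFER_CTYPE_BINARY:
        return false;
    default:
        /* UNKNOWN, or a value this client does not recognise. */
        return user_default;
    }
}
```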

(Of course, this could all be easily reformulated as an extension
attribute if desired.)

A version of filexfer-05 marked up with this proposal can be found at


<http://www.chiark.greenend.org.uk/ucgi/~jacobn/cvsweb/ssh-filexfer-filetype/draft-ietf-secsh-filexfer-05-plus-filetype.txt.diff?r1=1.3&r2=1.4&f=H>

or <http://tinyurl.com/4cbfz>.

Warts with this proposal:

It's unfortunate that this can't be made atomic within the existing
protocol design; it requires participating clients to do a STAT or
similar before OPEN.
(In fact, perhaps there should be some guidance on which of STAT or
LSTAT to do in this case.)
I don't know whether the non-atomicity would be a problem in practice.

It's perhaps tempting to use the IETF's usual content-type notation,
"MIME types" (RFCs 2045-9 and related standards). However:
- I don't know of any real filesystems which use it.
- Consider "multipart/mixed" and friends.
- Does everything in "text/*" want to be opened FXF_TEXT?
- Does everything outside "text/*" _not_ want to be opened FXF_TEXT?
  (Excluding composite types like multipart.)
- It risks adding further to SFTP's bloat if not specified carefully.
  We don't want to end up accidentally requiring implementations to
  contain entire MIME implementations (or more likely, ill-defined
  subsets thereof with poor interoperability).






