IETF-SSH archive


Re: Why SFTP performance sucks, and how to fix it



On  8 Jul 2003, Peter Gutmann <pgut001%cs.auckland.ac.nz@localhost> wrote:
> Martin Pool <mbp%samba.org@localhost> writes:
> 
> >Of course I agree that 4k or even 32k is too small for most modern networks.
> >But I did want to point out that going too far in the opposite direction can
> >be a problem too.
> 
> Fair enough.  I guess the XON/XOFF thing (to replace the Ack) would do this,
> or perhaps a channel close depending on the seriousness of the
> error.

Channel close is a terrible error-handling method.  To start with, it
gives no indication of what in particular went wrong.

>  I think it'd need some experimentation/going through usage cases to
> see where/how it's useful:
> 
> - Typical usage: SSHv1 and SSL/TLS don't seem to need any flow control, which
>   would imply that in most cases you don't have to worry about it.
> 

> - Out of disk space: Probably a channel close, since it's a fatal error and
>   you want the sender to stop permanently ("Please wait while the sysadmin
>   hot-plugs some more RAID storage" probably won't work :-).

Well, out of disk space was only an example of the kind of error that
can occur at any point.  There are others.  A more common one for
rsync is some kind of permission problem.  SECSH, unlike rsync, can
probably trap that in the OPEN operation before starting to send bulk
data, but the general problem remains.

User quotas are another quite realistic case: an interactive user who
runs into a quota might very well want to stay connected and delete
some files.
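For what it's worth, here is a rough sketch of what the sender side of
the proposed XOFF/XON signalling might look like, including the
optional timeout.  This is purely illustrative Python; no such
messages exist in the current drafts, and the names are made up:

```python
import time

class XonXoffSender:
    """Minimal model of the proposed XOFF/XON flow control: an XOFF
    pauses transmission, optionally with a timeout after which the
    sender treats the stall as a long-term/fatal error.  Message and
    class names are hypothetical."""

    def __init__(self):
        self.paused = False
        self.deadline = None

    def on_xoff(self, timeout=None):
        """Peer asked us to stop; remember an optional deadline."""
        self.paused = True
        self.deadline = None if timeout is None else time.monotonic() + timeout

    def on_xon(self):
        """Peer is ready again; resume sending."""
        self.paused = False
        self.deadline = None

    def may_send(self):
        """True if we may transmit; raises if the XON never arrived."""
        if not self.paused:
            return True
        if self.deadline is not None and time.monotonic() > self.deadline:
            raise RuntimeError("no XON before deadline: treat as fatal")
        return False
```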

> - Temporary resource problem (can't think of a good example at the moment but
>   I'm sure there's something): Send XOFF, perhaps with an optional timeout
>   indication, if the sender doesn't get an XON in that time they can consider
>   it a long-term/fatal error as above.
> 
> Another thing to keep in mind here is that if you signal a read/write of the
> entire file at once, the receiver knows how much disk space it needs and can
> lock the space before starting the receive

I don't know of a means for a Unix or Windows server to "lock" disk
space in that way.  I suppose the server might zero-fill the blocks
between getting the start of the request and reading the bulk of it,
but that seems highly contrived.
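If a server did want to try, the closest thing I know of is explicit
preallocation.  A sketch, assuming a POSIX-ish system: the
os.posix_fallocate wrapper only exists on some platforms, hence the
zero-fill fallback, which is exactly the contrived option above:

```python
import os

def preallocate(path, size):
    """Try to reserve `size` bytes before a transfer starts, so an
    out-of-space condition surfaces at open time rather than after
    2GB of data has been written.  Hypothetical helper, not part of
    any server implementation."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):
            # Real reservation where the platform supports it.
            os.posix_fallocate(fd, 0, size)
        else:
            # Fallback: explicit zero-fill of the whole region.
            chunk = b"\0" * 65536
            remaining = size
            while remaining > 0:
                remaining -= os.write(fd, chunk[:min(len(chunk), remaining)])
        return fd
    except OSError:
        os.close(fd)
        raise
```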

> I'd really prefer to know before I start that a transfer is going to
> fail, rather than write 2GB and then get the out-of-disk error.

The general problem I'm talking about here is that in a protocol like
SECSH, the size of the request blocks is the amount of data "at risk"
at any point: if something goes wrong, as it unavoidably may, the time
and bandwidth spent transmitting that block are wasted.

Here's another example, taken from experience with Samba: if you're
going to send chunks of 2GB at a time, they presumably have to be
"streamed" from disk rather than built in a temporary buffer.
(Indeed, people may well want to transfer files larger than their
virtual address space: 6GB files on a 32-bit machine.)  There are
unavoidable errors where you can in fact send less data than you
originally thought, if e.g. somebody else truncates the file while
you're reading it.  That means the header of the request ("write 6G")
is impossible to fulfil, so you need to either drop the connection or
pad with nulls, both of which are deeply undesirable.
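To make the point concrete, here is a per-block framing sketch: each
chunk declares its own length, so a file that shrinks mid-transfer
simply produces a shorter final chunk, rather than an unfulfillable
6GB promise made up front.  Illustrative Python, not any real SFTP
API:

```python
def stream_blocks(f, block_size=32768):
    """Yield (length, data) pairs read from file object `f`.  Because
    every chunk carries its own length, a short read (EOF, or the
    file being truncated under us) just ends the stream cleanly; no
    connection drop, no null padding."""
    while True:
        data = f.read(block_size)
        yield len(data), data          # per-chunk length prefix
        if len(data) < block_size:     # short read => end of stream
            return
```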

I suspect that sending a block at a time can give good performance if
you choose a good block size and pipeline transmissions, and if
problems with the underlying transport are addressed.  Perhaps the
number of outstanding requests should be scaled so that the data in
flight roughly fills the network's bandwidth-delay product?  I think
the request/response model in SECSH is a great thing for simplicity
and robustness, and it shouldn't be lightly cast aside.
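As a toy model of that pipelining, with local file reads standing in
for network round trips (the window size and helper names are my own
invention, not anything from the drafts):

```python
import concurrent.futures
import os

def pipelined_copy(src_path, dst_path, block_size=32768, window=16):
    """Copy a file using block-sized reads with up to `window` requests
    in flight at once, mimicking multiple outstanding FXP_READs so
    throughput isn't bounded by one round trip per block."""
    size = os.path.getsize(src_path)
    offsets = range(0, size, block_size)

    def read_block(off):
        # One independent handle per request, like a server would use.
        with open(src_path, "rb") as f:
            f.seek(off)
            return off, f.read(block_size)

    # A pool of `window` workers keeps up to `window` reads in flight.
    with concurrent.futures.ThreadPoolExecutor(max_workers=window) as ex, \
         open(dst_path, "wb") as out:
        futures = [ex.submit(read_block, off) for off in offsets]
        for fut in futures:            # replies handled in request order
            off, data = fut.result()
            out.seek(off)
            out.write(data)
```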

> If you follow this strategy you never need to signal out-of-disk during a
> transfer, only at the start.  Even better would be to require (or at least
> recommend) that clients include the file size in the FXP_OPEN *before* any
> data transfer is about to take place, so the receiver can respond to the open
> request with not-enough-disk-space error.  This is perfectly feasible in most
> cases where SFTP is used, since you're sending files of a known size (I
> haven't played with this too much, but it doesn't appear that implementations
> indicate the size at open much, which would require a code update).  You can
> really optimise the transfer management by following a few simple rules in
> which the sender transmits information required to ease data processing in
> advance.
> 
> If it's of any use to people, I could write a small informational appendix or
> whatever for the draft indicating how to use the protocol in a manner that
> makes data transfer management easier.
> 
> Peter.
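For the record, the existing FXP_OPEN already has room for this: the
attrs structure at the end of the request can carry
SSH_FILEXFER_ATTR_SIZE.  Here's a sketch of the wire encoding, with
constants taken from the filexfer draft and the outer SSH framing
omitted:

```python
import struct

# Constants from draft-ietf-secsh-filexfer.
SSH_FXP_OPEN = 3
SSH_FXF_WRITE, SSH_FXF_CREAT, SSH_FXF_TRUNC = 0x02, 0x08, 0x10
SSH_FILEXFER_ATTR_SIZE = 0x00000001

def fxp_open_with_size(request_id, filename, size):
    """Build an SSH_FXP_OPEN packet body whose ATTRS block carries the
    file size, so the server can refuse with not-enough-disk-space
    before any data arrives.  All integers are big-endian per the
    draft; strings are uint32-length-prefixed."""
    name = filename.encode("utf-8")
    body = struct.pack(">BI", SSH_FXP_OPEN, request_id)
    body += struct.pack(">I", len(name)) + name
    body += struct.pack(">I", SSH_FXF_WRITE | SSH_FXF_CREAT | SSH_FXF_TRUNC)
    body += struct.pack(">I", SSH_FILEXFER_ATTR_SIZE)  # attrs: flags
    body += struct.pack(">Q", size)                    # attrs: uint64 size
    return body
```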
-- 
Martin 


