IETF-SSH archive


Re: Why SFTP performance sucks, and how to fix it




Unfortunately this performance handbrake was reinvented in the SSHv2 protocol.
Like Ymodem-g and Zmodem running over modern modems, TCP/IP provides a
reliable, flow-controlled transport layer for the SSH protocol.  SSHv2 however
introduced an additional form of flow control that, like Xmodem, requires the
receiver to Ack each packet before more can be sent (the details aren't quite
as straightforward as this since the SSHv2 specification describes things in
terms of packets and data windows, but effectively it's the Xmodem per-packet
Ack).  Most implementations seem to use packet sizes of 16K or occasionally
32K, with some going as low as 4K.  What this means is that no matter how fast
the link, every (say) 16K the transmission stops for 1 RTT until the other
side has sent its Ack (referred to as a window adjust in SSHv2 terminology).
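For a rough sense of the cost being described: if the sender really does
stop after every window's worth of data and wait one round trip, throughput
is capped at window/RTT no matter how fast the link is.  A back-of-the-
envelope sketch in Python (the 16K window matches the text above; the
100 ms RTT is an assumed figure, not from the spec):

    # Throughput ceiling when the sender stalls one RTT per window.
    window_bytes = 16 * 1024      # 16K window, as described above
    rtt_seconds = 0.1             # assumed 100 ms round trip
    ceiling = window_bytes / rtt_seconds
    print("max throughput ~ %.0f KB/s" % (ceiling / 1024))  # ~160 KB/s

On a 100 Mbit/s link that works out to a little over 1% utilisation, which
is the kind of figure the text above is complaining about.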

The ack (a window adjust message) can be sent for any amount of data at
any time.  There is no one-to-one correspondence between adjust messages
and packets.

In addition, if the window size is greater than the packet size, one could
easily send acks in a timely fashion such that the remote side should never
run out of window.  I.e., if the max packet size on a channel is 16K but
the window size is 128K, 8 packets must be sent before the window is
exhausted and the sender must halt and wait for an adjust message.  If the
receiver sends adjust messages after receiving 32K of data, additional
window should become available to the sender before it exhausts its
window, and no throttling should ever occur.
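A minimal sketch of that receiver-side bookkeeping in Python (the class
and callback names are illustrative, and the 128K/32K figures are just the
ones from the example above):

    # Grant a 128K window up front, then return window to the sender
    # every time 32K has been consumed, so the sender regains window
    # long before it can run dry.
    WINDOW_SIZE = 128 * 1024
    ADJUST_THRESHOLD = 32 * 1024

    class ChannelReceiver:
        def __init__(self, send_adjust):
            # send_adjust: callback that emits a window adjust message
            # (SSH_MSG_CHANNEL_WINDOW_ADJUST) for the given byte count.
            self.send_adjust = send_adjust
            self.consumed = 0   # bytes received since the last adjust

        def on_data(self, data):
            self.consumed += len(data)
            if self.consumed >= ADJUST_THRESHOLD:
                self.send_adjust(self.consumed)
                self.consumed = 0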

In addition to the protocol-level handbrake, the SFTP protocol that runs on
top of SSH contains its own handbrake.  This protocol recommends that reads
and writes consist of no more than 32K of data, even though it's running over
the reliable SSH transport which is in turn running over the reliable TCP/IP
transport.  One common implementation limits SFTP packets to 4K bytes,
resulting in a mere 4% link utilisation in the previously-presented scenario.

Because reads and writes do not have to be synchronous, this is only a
'handbrake' if implementations choose to make it so.  In other words, even
if my writes must consist of no more than 32K packets, I need not wait for
the status of one packet before sending the next.  I'm free to send 32K
packets out as fast as I can (limited by the channel window, but as stated
above, that shouldn't be an issue as long as the server can actually
process the data as fast as I send it).  The results of those operations
will then come back in their own due time.
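As a sketch, a pipelined write loop might look like this in Python (the
sftp object and its send_write/recv_status calls are hypothetical
stand-ins for a real SFTP layer; only the pipelining pattern matters):

    # Issue every SSH_FXP_WRITE immediately; collect the SSH_FXP_STATUS
    # replies afterwards instead of waiting one round trip per write.
    CHUNK = 32 * 1024

    def pipelined_write(sftp, handle, data):
        pending = []
        for offset in range(0, len(data), CHUNK):
            req_id = sftp.send_write(handle, offset,
                                     data[offset:offset + CHUNK])
            pending.append(req_id)      # do NOT wait for status here
        for req_id in pending:          # statuses arrive in due time
            status = sftp.recv_status(req_id)
            if not status.ok:
                raise IOError("write %d failed: %s" % (req_id, status))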

Similarly, I can send as many read requests, each for 32K, to the server
as I want.  The server should then process them, streaming data to me
in 32K packets.  If I give it another read request as soon as it completes
one, it should be continually processing 32K read requests until the file
is finished.
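The same pattern on the read side, keeping a fixed number of 32K requests
outstanding so the server is never idle (again a sketch with a
hypothetical sftp object; EOF and short-read handling are omitted for
brevity):

    # Keep IN_FLIGHT read requests outstanding; as each SSH_FXP_DATA
    # reply arrives, immediately issue the next SSH_FXP_READ.
    CHUNK = 32 * 1024
    IN_FLIGHT = 8

    def pipelined_read(sftp, handle, length):
        next_offset = 0
        outstanding = {}                 # req_id -> file offset
        chunks = {}                      # file offset -> data
        while next_offset < length and len(outstanding) < IN_FLIGHT:
            req_id = sftp.send_read(handle, next_offset, CHUNK)
            outstanding[req_id] = next_offset
            next_offset += CHUNK
        while outstanding:
            req_id, data = sftp.recv_data()   # next reply, any order
            chunks[outstanding.pop(req_id)] = data
            if next_offset < length:
                req_id = sftp.send_read(handle, next_offset, CHUNK)
                outstanding[req_id] = next_offset
                next_offset += CHUNK
        return b"".join(chunks[off] for off in sorted(chunks))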

Since the packet length must be determined before the packet can be sent,
it might be burdensome for implementations to be required to read huge
chunks of data and send them.  What if I issue a read request for 4
gigabytes?

Sure, I could send the packet header, claiming 4 gig of data, and begin
reading data out of the file in reasonably sized chunks and sending it
down the wire.  But what if the file is truncated to 2 gig after I've read
and sent one gig of it?  What if the file is on a server and that server
fails after 2 gig have been read?  These are corner cases, but if the
protocol is incapable of handling such things, the protocol is flawed.

However, in order to facilitate writing higher-performing SFTP clients, I
have considered that it might make sense to add a second read command to
SFTP, which would allow reads of any size but could result in multiple
data packets, culminating in a final status packet.

I.e.:
   Client: SSH_FXP_MULTI_READ: req-id=243, offset=0, bytes=4,000,000
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   ...
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_STATUS: req-id=243, SUCCESS

or
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_STATUS: req-id=243, EOF

This would allow a client to request the entire file in a single read
operation, without requiring the server to allocate huge buffers and
without making it vulnerable to corner cases (such as mid-read
truncation).
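A sketch of how a server might satisfy such a request (SSH_FXP_MULTI_READ
is the proposal above, not an existing SFTP message; the conn helper
methods are illustrative):

    # Stream the requested range in bounded chunks, each as its own
    # SSH_FXP_DATA packet, then finish with a status packet.  A file
    # truncated mid-read simply shows up as a short read and yields
    # EOF; the server never buffers more than one chunk at a time.
    CHUNK = 64 * 1024

    def handle_multi_read(conn, req_id, fileobj, offset, nbytes):
        fileobj.seek(offset)
        remaining = nbytes
        while remaining > 0:
            data = fileobj.read(min(CHUNK, remaining))
            if not data:                  # file shorter than requested
                conn.send_status(req_id, "EOF")
                return
            conn.send_data(req_id, data)  # one SSH_FXP_DATA packet
            remaining -= len(data)
        conn.send_status(req_id, "SUCCESS")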

The fix for this problem is obvious: Remove the handbrake.  There is no good
reason for the per-packet Ack, and certainly other protocols such as SSHv1 and
SSL/TLS function perfectly without it (the absence of the handbrake in SSHv1
is why SSH FAQs observe that the SSHv1 scp is so much faster than the SSHv2
SFTP, even though SFTP is overall a better design).

I believe others have noted that SSHv1's throughput is throttled by the
slowest channel.

Try the following with SSHv1: write a simple socket server that never
reads the socket.  Write a simple socket client that sends data.  Connect
the client to the server through a port forward.  Eventually your entire
connection will stall.  You can no longer type at the shell prompt.
Other port forwards no longer pass data.

Any protocol which can transfer multiple streams of independent data has
to have a per-stream flow control mechanism.  Otherwise, the entire
protocol has to be stalled for the slowest stream.
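The failure mode is easy to see in a deliberately simplified sketch of a
demultiplexer with only connection-level flow control (a toy model, not
SSH code):

    # Without per-stream windows, a demultiplexer has exactly two bad
    # options when one stream's consumer stalls: buffer without bound,
    # or stop reading the shared TCP socket, freezing every stream.
    import collections

    class NaiveDemux:
        def __init__(self, limit=64 * 1024):
            self.buffers = collections.defaultdict(bytearray)
            self.limit = limit    # total buffering we will tolerate

        def on_tcp_data(self, stream_id, data):
            self.buffers[stream_id] += data  # slow consumer: grows
            total = sum(len(b) for b in self.buffers.values())
            # Once any one consumer falls behind, the only safe move
            # is to stop reading the socket -- stalling ALL streams.
            return total < self.limit   # False: stop reading TCP

Per-channel windows let the receiver throttle just the one stream that
has fallen behind while the others keep flowing.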

- Joseph




