IETF-SSH archive


Re: Why SFTP performance sucks, and how to fix it




Unfortunately this performance handbrake was reinvented in the SSHv2 protocol.
Like Ymodem-g and Zmodem running over modern modems, TCP/IP provides a
reliable, flow-controlled transport layer for the SSH protocol.  SSHv2 however
introduced an additional form of flow control that, like Xmodem, requires the
receiver to Ack each packet before more can be sent (the details aren't quite
as straightforward as this since the SSHv2 specification describes things in
terms of packets and data windows, but effectively it's the Xmodem per-packet
Ack).  Most implementations seem to use packet sizes of 16K or occasionally
32K, with some going as low as 4K.  What this means is that no matter how fast
the link, every (say) 16K the transmission stops for 1 RTT until the other
side has sent its Ack (referred to as a window adjust in SSHv2 terminology).
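For a rough sense of the cost being described: if the sender really does
stop after every window's worth of data and wait one round trip, throughput
is capped at window/RTT no matter how fast the link is.  A back-of-the-
envelope sketch in Python (the 16K window matches the text above; the
100 ms RTT is an assumed figure, not from the spec):

    # Throughput ceiling when the sender stalls one RTT per window.
    window_bytes = 16 * 1024      # 16K window, as described above
    rtt_seconds = 0.1             # assumed 100 ms round trip
    ceiling = window_bytes / rtt_seconds
    print("max throughput ~ %.0f KB/s" % (ceiling / 1024))  # ~160 KB/s

On a 100 Mbit/s link that works out to a little over 1% utilisation, which
is the kind of figure the text above is complaining about.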

The ack (a window adjust message) can be sent for any amount of data at
any time.  There is no one-to-one correspondence between adjust messages
and packets.

In addition, if the window size is greater than the packet size, one could
easily send acks in a timely fashion such that the remote side should never
run out of window.  I.e., if the max packet size on a channel is 16K but
the window size is 128K, 8 packets must be sent before the window is
exhausted and the sender must halt and wait for an adjust message.  If the
receiver sends adjust messages after receiving 32K of data, additional
window should become available to the sender before it exhausts its
window, and no throttling should ever occur.
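A minimal sketch of that receiver-side bookkeeping in Python (the class
and callback names are illustrative, and the 128K/32K figures are just the
ones from the example above):

    # Grant a 128K window up front, then return window to the sender
    # every time 32K has been consumed, so the sender regains window
    # long before it can run dry.
    WINDOW_SIZE = 128 * 1024
    ADJUST_THRESHOLD = 32 * 1024

    class ChannelReceiver:
        def __init__(self, send_adjust):
            # send_adjust: callback that emits a window adjust message
            # (SSH_MSG_CHANNEL_WINDOW_ADJUST) for the given byte count.
            self.send_adjust = send_adjust
            self.consumed = 0   # bytes received since the last adjust

        def on_data(self, data):
            self.consumed += len(data)
            if self.consumed >= ADJUST_THRESHOLD:
                self.send_adjust(self.consumed)
                self.consumed = 0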

In addition to the protocol-level handbrake, the SFTP protocol that runs on
top of SSH contains its own handbrake.  This protocol recommends that reads
and writes consist of no more than 32K of data, even though it's running over
the reliable SSH transport which is in turn running over the reliable TCP/IP
transport.  One common implementation limits SFTP packets to 4K bytes,
resulting in a mere 4% link utilisation in the previously-presented scenario.

Because reads and writes do not have to be synchronous, this is only a
'handbrake' if implementations choose to make it so.  In other words, even
if my writes must consist of no more than 32K packets, I need not wait for
the status of one packet before sending the next.  I'm free to send 32K
packets out as fast as I can (limited by the channel window, but as stated
above, that shouldn't be an issue as long as the server can actually
process the data as fast as I send it).  The results of those operations
will then come back in their own due time.
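As a sketch, a pipelined write loop might look like this in Python (the
sftp object and its send_write/recv_status calls are hypothetical
stand-ins for a real SFTP layer; only the pipelining pattern matters):

    # Issue every SSH_FXP_WRITE immediately; collect the SSH_FXP_STATUS
    # replies afterwards instead of waiting one round trip per write.
    CHUNK = 32 * 1024

    def pipelined_write(sftp, handle, data):
        pending = []
        for offset in range(0, len(data), CHUNK):
            req_id = sftp.send_write(handle, offset,
                                     data[offset:offset + CHUNK])
            pending.append(req_id)      # do NOT wait for status here
        for req_id in pending:          # statuses arrive in due time
            status = sftp.recv_status(req_id)
            if not status.ok:
                raise IOError("write %d failed: %s" % (req_id, status))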

Similarly, I can send as many read requests, each for 32K, to the server
as I want.  The server should then process them, streaming data to me
in 32K packets.  If I give it another read request as soon as it completes
one, it should be continually processing 32K read requests until the file
is finished.
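The same pattern on the read side, keeping a fixed number of 32K requests
outstanding so the server is never idle (again a sketch with a
hypothetical sftp object; EOF and short-read handling are omitted for
brevity):

    # Keep IN_FLIGHT read requests outstanding; as each SSH_FXP_DATA
    # reply arrives, immediately issue the next SSH_FXP_READ.
    CHUNK = 32 * 1024
    IN_FLIGHT = 8

    def pipelined_read(sftp, handle, length):
        next_offset = 0
        outstanding = {}                 # req_id -> file offset
        chunks = {}                      # file offset -> data
        while next_offset < length and len(outstanding) < IN_FLIGHT:
            req_id = sftp.send_read(handle, next_offset, CHUNK)
            outstanding[req_id] = next_offset
            next_offset += CHUNK
        while outstanding:
            req_id, data = sftp.recv_data()   # next reply, any order
            chunks[outstanding.pop(req_id)] = data
            if next_offset < length:
                req_id = sftp.send_read(handle, next_offset, CHUNK)
                outstanding[req_id] = next_offset
                next_offset += CHUNK
        return b"".join(chunks[off] for off in sorted(chunks))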

Since the packet length must be determined before the packet can be sent,
it might be burdensome for implementations to be required to read huge
chunks of data and send them.  What if I issue a read request for 4
gigabytes?

Sure, I could send the packet header, claiming 4 gig of data, and begin
reading data out of the file in reasonably sized chunks and sending it
down the wire.  But what if the file is truncated to 2 gig after I've read
and sent one gig of it?  What if the file is on a server and that server
fails after 2 gig have been read?  These are corner cases, but if the
protocol is incapable of handling such things, the protocol is flawed.

However, in order to facilitate writing higher-performing SFTP clients, I
have considered that it might make sense to add a second read command to
SFTP, which would allow reads of any size but could result in multiple
data packets, culminating in a final status packet.

I.e.:
   Client: SSH_FXP_MULTI_READ: req-id=243, offset=0, bytes=4,000,000
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   ...
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_STATUS: req-id=243, SUCCESS

or
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_DATA: req-id=243, 64K
   Server: SSH_FXP_STATUS: req-id=243, EOF

This would allow a client to request the entire file in a single read
operation, without requiring the server to allocate huge buffers and
without making it vulnerable to corner cases (such as mid-read
truncation).
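A sketch of how a server might satisfy such a request (SSH_FXP_MULTI_READ
is the proposal above, not an existing SFTP message; the conn helper
methods are illustrative):

    # Stream the requested range in bounded chunks, each as its own
    # SSH_FXP_DATA packet, then finish with a status packet.  A file
    # truncated mid-read simply shows up as a short read and yields
    # EOF; the server never buffers more than one chunk at a time.
    CHUNK = 64 * 1024

    def handle_multi_read(conn, req_id, fileobj, offset, nbytes):
        fileobj.seek(offset)
        remaining = nbytes
        while remaining > 0:
            data = fileobj.read(min(CHUNK, remaining))
            if not data:                  # file shorter than requested
                conn.send_status(req_id, "EOF")
                return
            conn.send_data(req_id, data)  # one SSH_FXP_DATA packet
            remaining -= len(data)
        conn.send_status(req_id, "SUCCESS")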

The fix for this problem is obvious: Remove the handbrake.  There is no good
reason for the per-packet Ack, and certainly other protocols such as SSHv1 and
SSL/TLS function perfectly without it (the absence of the handbrake in SSHv1
is why SSH FAQs observe that the SSHv1 scp is so much faster than the SSHv2
SFTP, even though SFTP is overall a better design).

I believe others have noted that SSHv1's throughput is throttled by the
slowest channel.

Try the following with SSHv1: write a simple socket server that never
reads the socket.  Write a simple socket client that sends data.  Connect
the client to the server through a port forward.  Eventually your entire
connection will stall.  You can no longer type at the shell prompt.
Other port forwards no longer pass data.

Any protocol which can transfer multiple streams of independent data has
to have a per-stream flow control mechanism.  Otherwise, the entire
protocol has to be stalled for the slowest stream.
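The failure mode is easy to see in a deliberately simplified sketch of a
demultiplexer with only connection-level flow control (a toy model, not
SSH code):

    # Without per-stream windows, a demultiplexer has exactly two bad
    # options when one stream's consumer stalls: buffer without bound,
    # or stop reading the shared TCP socket, freezing every stream.
    import collections

    class NaiveDemux:
        def __init__(self, limit=64 * 1024):
            self.buffers = collections.defaultdict(bytearray)
            self.limit = limit    # total buffering we will tolerate

        def on_tcp_data(self, stream_id, data):
            self.buffers[stream_id] += data  # slow consumer: grows
            total = sum(len(b) for b in self.buffers.values())
            # Once any one consumer falls behind, the only safe move
            # is to stop reading the socket -- stalling ALL streams.
            return total < self.limit   # False: stop reading TCP

Per-channel windows let the receiver throttle just the one stream that
has fallen behind while the others keep flowing.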

- Joseph




