IETF-SSH archive


RE: Why SFTP performance sucks, and how to fix it



Peter,

I totally agree that the SSH protocol is in some places more complicated
than it needs to be; however, the protocol does not impede performance
if properly implemented. In our experience, it is quite possible to
develop a fast SSH implementation, provided one invests some thought and
care in its design. The ssh.com implementation and ours are proof that
speeds of several MB per second can be obtained.

Further, there are now dozens of implementations that your suggestions
would break unnecessarily. While your proposals would make it easier for
a naive developer to put together a fast implementation, they remove
functionality that makes the protocol complete - making a so-so protocol
out of a sophisticated one.

The protocol provides you with the means to control the flow of the
session, but it doesn't tell you how to do it efficiently. If you want
to resolve this, the solution is to write a document describing how to
use the control elements for best performance; not to remove them
altogether. The protocol is fine.
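
One thing such a document would presumably cover is window sizing. As a rough sketch (not taken from any SSH draft), a windowed sender is only kept busy if the receiver's window is at least the bandwidth-delay product of the link. The function name below is hypothetical, and the figures are the T1 link and half-second RTT used as the example in the quoted message.

```python
def min_window_bytes(link_bits_per_second, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to keep the link busy.

    Assumption: nominal line rate, no allowance for framing or protocol overhead.
    """
    return int(link_bits_per_second / 8 * rtt_seconds)


# T1 (1.544 Mbit/s nominal) with a 0.5 s round trip
print(min_window_bytes(1_544_000, 0.5))
```

A window smaller than this value forces the sender to idle while it waits for the next window adjust; a larger one does not.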

Regards,

denis


> -----Original Message-----
> From: ietf-ssh-owner%netbsd.org@localhost 
> [mailto:ietf-ssh-owner%netbsd.org@localhost] On Behalf Of Peter Gutmann
> Sent: Tuesday, July 08, 2003 06:01
> To: ietf-ssh%netbsd.org@localhost
> Subject: Why SFTP performance sucks, and how to fix it
> 
> 
> Now that I've got your attention... :-).
> 
> The following is a section of the (not-yet-published) paper "Performance
> Characteristics of Application-level Security Protocols", which looks at,
> well, performance characteristics of application-level security protocols.
> One of these is SSH (the rest are SSL, PGP, and S/MIME, in case anyone
> cares).  The paper hasn't been published yet (it's still a work in
> progress).  Since it probably won't be published for a while (I'm probably
> submitting it to Usenix Security next year) and since the information in
> this section could be useful to SSH developers, I'm posting it here.  If
> people find it of any use, I might later post it to one or two general
> crypto lists to let other crypto developers know, since it describes the
> SFTP performance problem and how to fix it.
> 
> -- Snip --
> 
> 6.3 The SSHv2 and SFTP Performance Handbrake
> 
> In 1977, Ward Christensen created the Xmodem data transfer protocol
> [Christensen 1977].  Coming in an era of 300bps modems and unreliable
> links, Xmodem divided data into 128-byte packets and required an Ack to be
> sent for each packet before the next one could be transmitted.  As modems
> became faster and links more reliable, the need to Ack each Xmodem packet
> became more and more of a performance handbrake, since no matter how fast
> or reliable the link, no more than 128 bytes of data could be sent without
> waiting 1 RTT for the Ack.  The solution to the problem was to increase the
> packet size (Ymodem, Xmodem-1K), and drop the requirement to Ack each
> packet (Ymodem-g, Zmodem) [Forsberg 1988].  The latter was perfectly
> acceptable, since by then modems included their own error correction and
> flow control mechanism.
> 
> Unfortunately this performance handbrake was reinvented in the SSHv2
> protocol.  Like Ymodem-g and Zmodem running over modern modems, TCP/IP
> provides a reliable, flow-controlled transport layer for the SSH protocol.
> SSHv2 however introduced an additional form of flow control that, like
> Xmodem, requires the receiver to Ack each packet before more can be sent
> (the details aren't quite as straightforward as this since the SSHv2
> specification describes things in terms of packets and data windows, but
> effectively it's the Xmodem per-packet Ack).  Most implementations seem to
> use packet sizes of 16K or occasionally 32K, with some going as low as 4K.
> What this means is that no matter how fast the link, every (say) 16K the
> transmission stops for 1 RTT until the other side has sent its Ack
> (referred to as a window adjust in SSHv2 terminology).  Consider for
> example the effect of this on a T1 international link with a half-second
> RTT.  With the handbrake in operation, the link can run at only 17% of its
> total capacity.  This performance hit is so noticeable that it is mentioned
> in the FAQs of some SSH implementations [PuttyFAQ].
> 
> In addition to the protocol-level handbrake, the SFTP protocol that runs on
> top of SSH contains its own handbrake.  This protocol recommends that reads
> and writes consist of no more than 32K of data, even though it's running
> over the reliable SSH transport which is in turn running over the reliable
> TCP/IP transport.  One common implementation limits SFTP packets to 4K
> bytes, resulting in a mere 4% link utilisation in the previously-presented
> scenario.
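
The 17% and 4% figures quoted above can be reproduced with a simple stop-and-wait model: at most one block of data is delivered per round trip. The sketch below is an illustration rather than anything from the paper. It assumes the nominal T1 rate of 1.544 Mbit/s and ignores the time taken to serialise the block itself, and the function and constant names are made up for the example.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate (assumption: no overhead deducted)
RTT_SECONDS = 0.5               # the half-second round trip from the quoted scenario


def utilisation(block_bytes, link_bits_per_second=T1_BITS_PER_SECOND,
                rtt_seconds=RTT_SECONDS):
    """Fraction of the link used when the sender waits one RTT after every block."""
    link_bytes_per_second = link_bits_per_second / 8
    ceiling_bytes_per_second = block_bytes / rtt_seconds
    return min(1.0, ceiling_bytes_per_second / link_bytes_per_second)


for block in (16 * 1024, 4 * 1024):
    print(f"{block // 1024}K per RTT: {utilisation(block):.0%} of the link")
```

Under these assumptions a 16K block gives roughly 17% utilisation and a 4K block roughly 4%, matching the numbers in the quoted text.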
> 
> The fix for this problem is obvious: Remove the handbrake.  There is no
> good reason for the per-packet Ack, and certainly other protocols such as
> SSHv1 and SSL/TLS function perfectly without it (the absence of the
> handbrake in SSHv1 is why SSH FAQs observe that the SSHv1 scp is so much
> faster than the SSHv2 SFTP, even though SFTP is overall a better design).
> The effects of running without the handbrake were investigated using
> cryptlib with a fairly rudimentary implementation of SFTP running over the
> built-in SSHv2.  cryptlib's SSH implementation has always set the window
> size to INT_MAX (some implementations have problems with UINT_MAX as the
> window size), which effectively disables the SSH-level handbrake.  The SFTP
> implementation followed suit, requesting a read/write of the entire file at
> once rather than breaking it up into little packets at the SFTP level
> (packetisation is already handled at the SSH and TCP/IP layers).  Run over
> an international link (pretty much a given when you're in New Zealand),
> this SFTP implementation was around five times faster than the Putty
> implementation of SFTP talking to OpenSSH, which sends data in 4K SFTP
> packets and (by extension) 4K SSH packets.  Even over a low-latency link,
> the difference was impressive: cryptlib was an order of magnitude faster
> than Putty on the loopback interface (latency being relative in this case).
> 
> The SSH-level handbrake can therefore be provisionally removed by having
> implementations set the window size to INT_MAX, and permanently removed by
> deprecating the Ack/window-based flow control and perhaps optionally
> providing Xon/Xoff-style flow control if absolutely necessary (as was
> mentioned earlier, both SSHv1 and SSL/TLS function fine without requiring
> this).  The SFTP-level handbrake can be removed by eliminating the maximum
> packet-size wording of the SFTP specification, and recommending that
> implementations read and write all data at once rather than engaging in
> additional redundant packetisation at the SFTP level.
> 
> Most of this can be effected through a simple code change; however,
> implementors should be aware that many implementations will still stop and
> wait for an Ack after a certain amount of data has been transmitted, even
> with an effectively infinite-sized window.  On the sender side things
> aren't quite so bad: experimentation has shown that it's fairly safe to
> ignore the receiver's window size and send data at the maximum rate
> possible, discarding any window adjusts that arrive (the only slight
> complication is that it's occasionally necessary to stop sending for a
> moment and clear the read channel of the accumulation of Acks that have
> arrived while sending).  Since most implementations include a facility for
> checking the peer's software version to identify and work around
> implementation bugs, detecting pre-handbrake-fix implementations and
> providing the appropriate slower interpretation of the protocol should be
> relatively straightforward.  In addition, FAQs about the poor performance
> of SFTP will need to be updated.
> 
> [Christensen 1977] "MODEM.ASM", Ward Christensen, August 1977 (the Xmodem
> protocol was defined in terms of "What this program does" rather than being
> formally documented, the author described it in a Compuserve post some
> years later as "a quick hack I threw together").
>
> [Forsberg 1988] "Xmodem/Ymodem Protocol Reference: A compendium of
> documents describing the Xmodem and Ymodem File Transfer Protocols", Chuck
> Forsberg, October 1988.
>
> [PuttyFAQ] "PuTTY FAQ", Simon Tatham, 2003,
> http://www.chiark.greenend.org.uk/~sgtatham/putty/faq.html, question A.6.8,
> "PSFTP transfers files much slower than PSCP".
> 
> -- Snip --
> 
> While I'm pointing out things that should be fixed in the spec, my other
> big gripe is the way the initial message is handled.  Currently the spec
> describes a rather messy mechanism where both sides start by shouting at
> each other and then engage in a complex dance to sort out what's what
> afterwards (the "guessing" stuff).  This leads to really messy
> implementations when one of the partners doesn't get the dance steps right.
> 
> There is no good reason for this complication in the protocol.  I don't buy
> the RTT argument given in the SSH-transport draft: the guessing stuff saves
> one whole RTT, but then the incredibly chatty authentication protocol
> ("Would you like to authenticate then?" - "Yes I'd like to authenticate" -
> "How would you like to authenticate?" - "Well, would the following suit
> you?" - "That looks about right, let's do it" - "Right, I'm about to start"
> - etc etc etc) more than makes up for any minuscule savings during the
> initial handshake.
> 
> The way to fix this is simple: Replace all the guessing stuff and the
> complex rules that go with it with:
> 
>   Key exchange begins by each side sending lists of supported algorithms.
>   The server sends its list of supported algorithms first, the client
>   chooses which ones it prefers that it also supports and sends back its
>   choice in the reply.
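
The proposed exchange quoted above could look something like the following on the client side. This is a minimal sketch under stated assumptions: the function name, the category keys, and the dictionary layout are hypothetical and not taken from the SSH drafts, although the algorithm strings are names used by SSHv2 implementations.

```python
def choose_algorithms(server_offer, client_preferences):
    """For each category, pick the first client-preferred algorithm that the
    server also listed. Raises ValueError if a category has no overlap."""
    chosen = {}
    for category, preferred in client_preferences.items():
        offered = set(server_offer.get(category, ()))
        match = next((name for name in preferred if name in offered), None)
        if match is None:
            raise ValueError(f"no common algorithm for {category}")
        chosen[category] = match
    return chosen


# The server speaks first and lists what it supports (illustrative values).
server_offer = {
    "kex": ["diffie-hellman-group1-sha1"],
    "cipher": ["3des-cbc", "aes128-cbc"],
    "mac": ["hmac-sha1", "hmac-md5"],
}
# The client's lists are in its own order of preference.
client_preferences = {
    "kex": ["diffie-hellman-group1-sha1"],
    "cipher": ["aes128-cbc", "3des-cbc"],
    "mac": ["hmac-sha1"],
}
print(choose_algorithms(server_offer, client_preferences))
```

Because only the client chooses, there is no guessed first packet to discard and no tie-breaking rule to implement on both sides.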
> 
> That removes all of the handshake-dance complexity, and vastly simplifies
> implementations.
> 
> Oh yes, in case anyone finds the SFTP info above useful and wants to
> reference it for some reason, please cite it as '"Performance
> Characteristics of Application-level Security Protocols", Peter Gutmann, to
> appear', since it's not officially published yet.
> 
> Peter.
> 



