IETF-SSH archive
Re: AEAD in ssh
Yes, but all of this concerns low-overhead padding. The problem with unencrypted lengths is that they also make high-overhead padding more difficult, or possibly impossible.
Suppose you have a user typing at a terminal. When the user types "Q", you have to send an SSH_MSG_CHANNEL_DATA containing "Q". You cannot add more than 255 bytes of padding to that packet, so it's going to be small.
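To put rough numbers on that limit, here is a minimal sketch. It assumes the classic RFC 4253 binary packet layout (4-byte packet_length, 1-byte padding_length, payload, padding, with the whole thing aligned to the cipher block size) and an 8-byte block; AEAD and encrypt-then-MAC modes may leave the length field out of the alignment. The helper names are hypothetical and not taken from any implementation.

```python
import struct

SSH_MSG_CHANNEL_DATA = 94  # message number from RFC 4254
MIN_PADDING = 4            # RFC 4253 requires at least 4 bytes of padding
MAX_PADDING = 255          # padding_length is a single byte


def channel_data_payload(recipient_channel, data):
    """Build an SSH_MSG_CHANNEL_DATA payload: byte type, uint32 channel, string data."""
    return struct.pack(">BII", SSH_MSG_CHANNEL_DATA, recipient_channel, len(data)) + data


def padding_bounds(payload_len, block_size=8):
    """Return (smallest, largest) padding length that keeps the packet block-aligned.

    Assumption: packet_length and padding_length both count toward alignment.
    """
    unpadded = 4 + 1 + payload_len
    smallest = MIN_PADDING + (-(unpadded + MIN_PADDING)) % block_size
    largest = MAX_PADDING - (unpadded + MAX_PADDING) % block_size
    return smallest, largest


payload = channel_data_payload(0, b"Q")
lo, hi = padding_bounds(len(payload))
# Packet size on the wire, excluding any MAC or AEAD tag.
print(4 + 1 + len(payload) + lo, 4 + 1 + len(payload) + hi)
```

Under these assumptions a single keystroke can never occupy more than a few hundred bytes, far below the size of a typical file-transfer data packet.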
If your defense against traffic analysis is to fill the connection up to its maximum bandwidth with SSH_MSG_IGNORE messages, unencrypted lengths prevent you from effectively hiding whether the traffic is file transfer or terminal. If the traffic is file transfer, you have to send large SSH_MSG_IGNORE messages, or else the real data packets are going to stick out like a sore thumb. If the traffic is terminal, you have to send small SSH_MSG_IGNORE messages, or else the real typing is going to stick out.
Sure, low-overhead padding doesn't do much. But unencrypted lengths also defeat higher-overhead padding.
I haven't actually implemented high-overhead padding, but it's something I've been considering. Trying to combine that with unencrypted lengths in AEAD would be shooting myself in the foot.
----- Original Message -----
From: Peter Gutmann
Sent: Wednesday, February 3, 2016 01:02
To: Mark D. Baushke
Cc: denis bider ; Niels Möller ; Stephen Farrell ; ietf-ssh%NetBSD.org@localhost
Subject: RE: AEAD in ssh
mdb%juniper.net@localhost <mdb%juniper.net@localhost> writes:
>That said, the paper does raise questions about how to efficiently use such
>countermeasures and requires a bit more thinking on my part.
It was a real eye-opener: it points out that the simplistic mechanism built
into both TLS and SSH for (attempting to) defeat traffic analysis pretty much
doesn't work. There's been a bit of earlier work in this area (the work on
Bayesian classifiers, notably used by the Tor folks) but this is a 2nd-gen
study that looks at how to defeat countermeasures that that initial 1st-gen
work inspired.
I agree that there may be issues of applicability to SSH, but a bigger
question is the usual WYTM (what's your threat model): you need to figure out
what you're trying to defend against in order to evaluate your defences. It'd
be nice to have some rigorous model like the random oracle model or standard
model in crypto. The pad-the-packets approach doesn't really have any threat
model except that it seems like a good thing to do. What they've pointed out
is that it's ineffective against traffic classifiers. Hopefully there'll be
more work on this in the future (although the previous papers were from 5-10
years ago)...
The tl;dr version, from a comment in my code:
Padding in order to defeat traffic analysis is extremely problematic. The
simplistic approach of adding random padding provides little more than warm
fuzzies since it falls almost trivially to statistical classifiers like
Bayesian or support vector machines. In fact almost any attempt to defeat
traffic analysis with low overhead, including random padding, linear padding
(padding to the nearest 128 bytes), exponential padding (padding to the
nearest power of two), mice/elephant padding (padding short packets to 128
bytes and long ones to the MTU), straight padding to the MTU, and padding by
a random multiple of 8 or 16 bytes, doesn't work (see "Peek-a-Boo, I Still
See You: Why Efficient Traffic Analysis Countermeasures Fail" by Dyer,
Coull, Ristenpart and Shrimpton).
It's only when quite complex traffic morphing, sending fixed-length packets
at fixed intervals and the like, is applied and reaches an overhead of 400%
that things start getting tricky for an attacker, or at least an attacker
using a straightforward Bayesian or SVM classifier.
This doesn't leave much choice in the way of padding, since no matter what
we do it won't be terribly effective. The best option is to choose the
variant with the lowest overhead and use that, since it has at least a small
amount of effect. This is linear padding, but we pad to 64 bytes instead of
128 because we're typically used in embedded environments which both use
shorter messages and often have bandwidth constraints.
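For concreteness, the padding rules named in the comment above can be sketched as follows. This is an illustration only, not the code the comment was taken from: the function names are hypothetical, the MTU value of 1500 is an assumption (the text only says "the MTU"), and "nearest" is read as rounding up to the next boundary.

```python
MTU = 1500  # assumed value; the text does not give one


def linear_pad(n, block=128):
    """Round n up to the next multiple of block (64 for the variant chosen above)."""
    return -(-n // block) * block


def exponential_pad(n):
    """Round n up to the next power of two."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def mice_elephant_pad(n, small=128, mtu=MTU):
    """Pad short packets to `small` bytes and everything else to the MTU.

    Assumes n does not exceed the MTU.
    """
    return small if n <= small else mtu


def overhead(lengths, pad):
    """Fraction of extra bytes sent when every length is passed through pad."""
    raw = sum(lengths)
    return (sum(pad(n) for n in lengths) - raw) / raw


# Two 32-byte messages padded to 64 bytes each double the bytes sent.
print(overhead([32, 32], lambda n: linear_pad(n, 64)))
```

Every one of these maps a true length to a padded length deterministically, so the padded lengths still track the underlying traffic pattern; that is consistent with the paper's finding that such schemes fall to classifiers.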
Peter.