Subject: Re: Concerns about our NewReno code
To: Bill Studenmund <wrstuden@NetBSD.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 11/08/2004 11:05:11
Bill,
If you are concerned about TCP performance issues between NetBSD and
{MacOS,FreeBSD-4.4}, then your task must be to eliminate the
`squashed ACK' bug from consideration. That is, compare and contrast:
a) TCP between NetBSD and FreeBSD-4.4
b) TCP between NetBSD and FreeBSD-4.5
or some later release than FreeBSD-4.5. (If (dim) memory serves, the
changes to fix the so-called `squashed-ACK' bug were posted by Matt
Dillon to FreeBSD-hackers, then committed to FreeBSD's RELENG_4 branch,
circa November 2001. You want the third of a series of 3 commits by
Matt Dillon.)
That said -- without looking at the code, Jeffrey Hsu's description
sounds clean, perhaps cleaner than what we have now. Let's wait and
see what Jason says?
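For concreteness, here is a rough sketch of what the description quoted
below seems to amount to. It is not taken from the FreeBSD, Darwin, or
NetBSD sources. The struct, the function names, and the in_fast_recovery
flag are all made up for illustration; only t_dupacks, snd_recover and
snd_high are names that appear in the quoted text. It assumes the BSD
convention that snd_max is one past the last byte sent, and it reads the
"Careful variant" as "enter Fast Retransmit only if the duplicate ACKs
cover more than send_high".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe comparisons, in the style of the BSD SEQ_* macros. */
#define SEQ_LT(a, b) ((int32_t)((tcp_seq)(a) - (tcp_seq)(b)) < 0)
#define SEQ_GT(a, b) ((int32_t)((tcp_seq)(a) - (tcp_seq)(b)) > 0)

/* Hypothetical, stripped-down control block; NOT the real struct tcpcb. */
struct sketch_tcpcb {
	tcp_seq snd_una;          /* oldest unacknowledged sequence number */
	tcp_seq snd_max;          /* one past the highest sequence number sent */
	tcp_seq snd_recover;      /* snd_max when Fast Recovery was entered */
	tcp_seq snd_high;         /* snd_max at the last retransmit timeout */
	int     t_dupacks;        /* consecutive duplicate ACK count */
	bool    in_fast_recovery; /* explicit state, not inferred from t_dupacks */
};

enum ack_action {
	ACK_NONE,
	ACK_FAST_RETRANSMIT,
	ACK_PARTIAL_RETRANSMIT,
	ACK_EXIT_RECOVERY
};

/* A duplicate ACK arrived. */
static enum ack_action
sketch_dupack(struct sketch_tcpcb *tp, tcp_seq ack)
{
	assert(tp != NULL);
	tp->t_dupacks++;
	if (tp->in_fast_recovery || tp->t_dupacks != 3)
		return ACK_NONE;
	/* Assumed Careful variant: the ACK must cover more than send_high. */
	if (!SEQ_GT(ack, tp->snd_high))
		return ACK_NONE;
	tp->snd_recover = tp->snd_max;
	tp->in_fast_recovery = true;
	return ACK_FAST_RETRANSMIT;
}

/* An ACK arrived that advances snd_una. */
static enum ack_action
sketch_newack(struct sketch_tcpcb *tp, tcp_seq ack)
{
	assert(tp != NULL);
	tp->snd_una = ack;
	tp->t_dupacks = 0;
	if (!tp->in_fast_recovery)
		return ACK_NONE;
	if (SEQ_LT(ack, tp->snd_recover))
		return ACK_PARTIAL_RETRANSMIT; /* partial ACK: stay in recovery */
	tp->in_fast_recovery = false;          /* full ACK: recovery is over */
	return ACK_EXIT_RECOVERY;
}

/* Window update, data flowing the other way, etc. */
static void
sketch_reset_dupacks(struct sketch_tcpcb *tp)
{
	assert(tp != NULL);
	tp->t_dupacks = 0; /* in_fast_recovery is deliberately left alone */
}
```

The point of the separate flag is visible in sketch_reset_dupacks():
clearing t_dupacks no longer ends Fast Recovery, so a partial ACK that
arrives afterwards still triggers a retransmit instead of leaving the
connection to wait for the retransmit timer.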
In message <20041108184922.GC20869@netbsd.org>,
Bill Studenmund writes:
>
>Recently, I was helping a customer debug a latency-sensitive application
>over TCP, where they were sending a lot of data from MacOS X. The TCP
>connection would just stop for over a second. Upon further investigation,
>we realized we were seeing problems with multi-packet drop recovery. We
>were facing the exact issue that the "NewReno" code was designed to
>address.
>
>We then remembered that MacOS 10.3 (and a number of earlier versions) has
>(have) a TCP stack taken from FreeBSD 4.4. So we started looking at the
>changes made to the FreeBSD 4 TCP stack, and found revision 1.107.2.36 of
>their tcp_input.c, which has the following comment:
>
>*****
>Merge from current
> rev 1.170: Cosmetic-only changes for readability.
> rev 1.187: Fix NewReno.
>
>Rev 1.170 was done primarily to expose the shortcomings of the handling
>of t_dupacks field in the old NewReno code. Rev 1.187 replaces the old
>NewReno logic with an implementation which closely follows the letter
>of the spec.
>*****
>
>"Fix NewReno"... So we looked into it, and made patches to Darwin, and it
>seemed to work better.
>
>So now here's where this turns into a NetBSD issue. NetBSD's New Reno code
>SURE looks like the code FreeBSD and MacOS had. While the comment in
>FreeBSD's cvs is rather vague, hsu@freebsd dot org explained more in a
>follow-up ( http://docs.freebsd.org/cgi/getmsg.cgi?fetch=557424+0+archive/2003/cvs-all/20030119.cvs-all ):
>
>The state for when we enter, are in, and leave the NewReno Fast Recovery
>period has been split out from t_dupacks into its own state variable,
>snd_high, which has the semantics described in the spec
> RFC2582, NewReno Modification to TCP's Fast Recovery
>for the variable called "send_high". Previously, this state was
>overloaded in the t_dupacks field of the tcpcb. The problem with this
>is that a number of conditions reset t_dupacks, such as data flowing
>back the other way, window size changes, and re-ordered acks, and so
>erroneously kick you out of Fast Recovery mode. The end result
>is the TCP stack often has to wait for a timeout to retransmit, which
>would have been avoided if NewReno was working correctly. Tom Henderson
>has analyzed before and after packet traces and the ones before were very
>sick. Now, we correctly transition into and out of Fast Recovery, do the
>correct window adjustments on partial acks, and retransmit when we should.
>
>In addition, the variable named "send_high" in the spec has been split
>out from snd_recover, in order to make the check for more explicit
>and to detect for sequence wraparound. This new version of the
>NewReno logic implements what the spec calls the Careful variant of Fast
>Retransmit, which is the version recommended by the spec.
>
> Jeffrey
>
>
>
>So does anyone else think we need this change too? I can cobble a diff
>together, but maybe someone else'd like to look at this?
>
>Take care,
>
>Bill
>