Subject: Re: Add "last record" and "last mbuf" pointers to sockbuf
To: Jason R Thorpe <thorpej@wasabisystems.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-kern
Date: 07/03/2002 14:02:39
In message <20020701143001.E3496@dr-evil.shagadelic.org>Jason R Thorpe writes:
>
>...going back to a conversation I started a few months ago...
>
>I am planning on adding "last record" and "last mbuf" pointers to
>the sockbuf structure. This makes insertion of new data into the
>socket buffer O(C) rather than O(N). When you're using large socket
>buffers, this really makes a difference.
A few points, in no particular order...
* Great stuff. Let's do it.
* Yes, kttcp lets you focus on just the socket/mbuf-internal portions.
However, if typical usage *does* go to userspace, kttcp
will overstate the relative (percentage), but not absolute
(seconds/milliseconds of walltime) speedup. I'm sure you know
this. (I'm mostly wondering if the 30% speedup on a DP83820
you mentioned yesterday included [kernel<->userspace overhead, or not.)
* Some of the the names don't fit the pre-existing mbuf code, e.g.,
sbappendrecord() vs. sbappend_stream()?
I think sbappendstream() is a better fit.
The SB_UPDATE_TAIL() macro gets used a lot, and what it
does wasn't immediatley clear. I'd find the patch much easier
to read/follow if this macro-name included some hint that it just
checks for a null socketqueue; and if so, nulls the tail pointer.
I dunno, SB_EMPTY_FIXUP() or something similar?
* Your dgram bugs: For a dgram socket, the "tail" pointer
wants to point to the head of the last *chain* (not the last mbuf
in the chain). For a pure stream socket, it wants
to point to the last mbuf in the last chain. [I need more
coffee to remember how streams with record boundaries (TP4) work].
I'll hazard a guess that your bugs are here, somewhere.
I promise to read the patch over the weekend.
* is it worth doing a TCP `fast path' in sosend?.
The historical motivation was perverse apps which did
large numbers of small (3-byte) write()s on TCP sockets.
It's a natural fit with keeping an explicit tail pointer.
(IIRC there were certain workloads/ benchmarks where it made
a signficant win. However the motivating apps/benchmarks may be long-dead).
* Again: great stuff. Let's iron the bugs out and do it.