Subject: Re: libkern's rb tree
To: None <tech-kern@netbsd.org>
From: Joerg Sonnenberger <joerg@britannica.bec.de>
List: tech-kern
Date: 10/24/2007 10:37:09
On Mon, Oct 22, 2007 at 06:29:02PM -0700, Jason Thorpe wrote:
>> The first concern is that the code doesn't allow "type safe" usage. I
>> have a set of wrapping code that avoids this by creating inline glue
>> for a very small cost. IMO this is necessary feature before actual
>> wide-spread use.
>
> The lack of type safety doesn't really bother me. This would be no
> different for generic hash table routines, etc. In any case, we're written
> in C, and void * is what we get if we don't want wrappers (which increase
> code size) or code duplication (which increases code size).
As soon as the rb tree entry is not the first field in a structure,
it is a bother.
>> The second concern is the use of the sentinel nodes. Having done the rb
>> tree myself I'm sure that they save very little actual code and might
>> have even increased the size.
>
> Again, sentinel nodes don't bother me. And, as documented in the code,
> using them enables a couple of clever optimizations. Matt designed that rb
> code to be used for a MIPS pmap module, and wanted it to be small and fast,
> and shaving cycles was a primary concern.
Is MIPS a platform where NULL pointer comparisions are more expensive then
testing a bit field in a pointer? Otherwise I am quite sure that this is an
argument that doesn't hold.
>
>> The third concern is generality. The only use of sys/tree.h in the
>> kernel is uvm right now and that does some nasty things with RB_AUGMENT.
>> We can't replace that with the current rb_moved. If we want to have a
>> single copy in the tree, we need a callback whenever a child of a node
>> changes. That adds a conditional to a lot of branches though.
>
> Having a callback could take a separate path (i.e. duplicate some small
> amount of code within rb.c itself). That said, it seems like uvm_map.c
> could compute that information lazily, thus eliminating the need for the
> "augment" altogether, and actually improving performance (you only need to
> know a sub-tree's available space when attempting to allocate from the map,
> so why constantly recompute it when e.g. removing mappings?)
Computing it lazily also destroys boundaries on the computational cost. For the
uvm code, I would call that a non-trivial problem. There's also the problem that
removing an inner node can result in non-continous changes to the tree, which
are hard to work around without forcing a full recompute all the time.
Joerg