Subject: Re: CVS commit: src/sys/lib/libkern
To: None <source-changes@NetBSD.org>
From: Joerg Sonnenberger <joerg@britannica.bec.de>
List: source-changes
Date: 11/22/2007 22:00:40
On Fri, Nov 23, 2007 at 09:29:04AM +1300, David Sainty wrote:
> Joerg Sonnenberger wrote:
> > On Thu, Nov 22, 2007 at 05:00:32PM +1300, David Sainty wrote:
> >
> >> It is also often cheaper to insert, and modify the existing node if
> >> present rather than query before insert, and then repeat the lookup for
> >> the insert operation too. It's also more "atomic friendly" than query +
> >> insert, should the tree be doing its own locking.
> >>
> >
> > I guess the critical question here is: how much overhead do you have in
> > preparing the new node. As soon as you have to allocate memory, just
> > checking for dups first is way cheaper. That's what makes me believe
> > that a simple boolean that tells "I was successful" is good enough.
> >
> Fair point if the allocation is via malloc(), though it still depends on
> the cost of node comparisons - which may be high in itself.
>
> But I have a couple of counterpoints :)
>
> If the node was allocated from a pool then the memory allocation cost
> would be near O(1), as opposed to the (something like) O(lg(n)) cost of
> the initial duplicate check.
>
> Aside from cost, even in the error case (where the insert is really
> never expected to fail) it would be nice to be able to report more
> detail on the collision should it happen.
O(lg(n)) is something like 40 node comparisions for 1000,000 nodes. But
let's not start bean counting. My assumption is that code using the red
black tree either assumes that the node already exists and does a look
up already. The critical path is successful find for that. This is the
normal case for a hot cache. The other major usage is insert new
elements and assume that a failed insertion is the error path, e.g.
not the critical path. In that case an additional look up for
diagnostics etc is cheap as well (cache hot anyway).
More important for me is the API question. I prefer truth values in the
natural order and checking for NULL either violates that or is more
verbose for the explicit conversion.
But we don't have to have a long discussion, I can understand your
reasons as well.
Joerg