Subject: Re: Thread benchmarks - FreeBSD corrections
To: None <tech-kern@NetBSD.org, current-users@netbsd.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-kern
Date: 10/03/2007 15:08:30
On Tue, Oct 02, 2007 at 12:26:55AM +0200, Kris Kennaway wrote:

> I tested on a quad 500 MHz p3 (i.e. 30% slower clock speed than your 
> system), via 100Mbps em0.  Performance was already at the level of the 
> FreeBSD curve on your graph (about 320 tps across a range of loads), and 
> if I scale up by 700/500 then it's about the same as your NetBSD curve.
> I suspect that this will actually underestimate performance a bit
> because the CPU is an older generation than yours, so the difference is 
> not just clock speed.  One thing that is kind of interesting is that 
> some of the locking optimizations that we have not yet committed don't 
> make a difference on this machine and workload, apparently they are only 
> important at 8 CPUs and above.
> 
> Anyway, this all suggests to me that something is going wrong on your 
> system, so if the above doesn't help then we'll have to look closer. 
> One other possibility is that your NIC may be misbehaving.

It turns out that the low FreeBSD numbers were due to debugging in malloc().
As suggested, I recompiled FreeBSD's libc without the debugging, and
FreeBSD's performance is much better: as of right now, NetBSD and FreeBSD
are fairly closely matched on my 4-way system. From a single run of each OS,
with both NetBSD and FreeBSD using SCHED_4BSD:

	http://www.netbsd.org/~ad/sysbench/sysbench-4bsd.png

Here are the results with FreeBSD using SCHED_ULE and NetBSD using Mindaugas'
experimental scheduler. Like ULE, it uses per-CPU run queues. Among other
things, that means threads tend to migrate between CPUs less.

	http://www.netbsd.org/~ad/sysbench/sysbench-pcpu.png
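
To illustrate why per-CPU run queues tend to reduce migration, here is a
toy model in Python. It is only a sketch: it is not NetBSD or FreeBSD
scheduler code, the function names are made up for this example, and it
assumes round-robin time slices with no blocking, priorities, or load
balancing. A real scheduler with per-CPU queues still migrates threads
when it rebalances load.

```python
from collections import deque


def count_migrations_global(n_threads, n_cpus, rounds):
    """Toy model: one shared FIFO run queue served by every CPU."""
    run_queue = deque(range(n_threads))
    last_cpu = {}      # thread id -> CPU it last ran on
    migrations = 0
    for _ in range(rounds):
        ran = []
        for cpu in range(n_cpus):
            if not run_queue:
                break
            thread = run_queue.popleft()
            # A migration is a thread running on a different CPU than last time.
            if thread in last_cpu and last_cpu[thread] != cpu:
                migrations += 1
            last_cpu[thread] = cpu
            ran.append(thread)
        # Threads that used their time slice go back to the tail of the queue.
        run_queue.extend(ran)
    return migrations


def count_migrations_per_cpu(n_threads, n_cpus, rounds):
    """Toy model: each CPU has its own FIFO run queue, no load balancing."""
    queues = [deque() for _ in range(n_cpus)]
    for thread in range(n_threads):
        # Hypothetical static placement: spread threads across CPUs once.
        queues[thread % n_cpus].append(thread)
    last_cpu = {}
    migrations = 0
    for _ in range(rounds):
        for cpu, queue in enumerate(queues):
            if not queue:
                continue
            thread = queue.popleft()
            if thread in last_cpu and last_cpu[thread] != cpu:
                migrations += 1
            last_cpu[thread] = cpu
            # The thread stays on the queue of the CPU it just ran on.
            queue.append(thread)
    return migrations


if __name__ == "__main__":
    # 6 runnable threads on a 4-way machine, 10 scheduling rounds.
    print("shared queue :", count_migrations_global(6, 4, 10))
    print("per-CPU queue:", count_migrations_per_cpu(6, 4, 10))
```

In this model the per-CPU variant never migrates a thread, because nothing
moves a thread off the queue it was first placed on, while the shared queue
hands threads to whichever CPU asks next. The thread count is deliberately
not a multiple of the CPU count; otherwise the shared queue would happen to
return each thread to the same CPU every round.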

Thanks,
Andrew