NetBSD-Users archive


Re: Testing memory performance



OK, I disabled NUMA in the BIOS. There is a slight performance hit, but
NetBSD is still much slower than Linux. This time I ran a single-threaded
test, but the disparity grows with the number of threads.

NetBSD:
$ ./sv_mem -mode=wr -size=16g -block=1k -threads=1
Thread 1     preflt=11285.07 msec, memcpy=3056.22 MiB/sec
Total transfer rate: 3056.22 MiB/sec

Linux:
$ ./sv_mem -mode=wr -size=16g -block=1k -threads=1
Thread 1     preflt=7319.33 msec, memcpy=5089.21 MiB/sec
Total transfer rate: 5089.21 MiB/sec

Note that pre-faulting 16 GiB of pages (touching 1 byte in every 4 KiB
page) took NetBSD around 11 seconds and Linux around 7 seconds. With 16
concurrent threads, the NetBSD pre-fault takes roughly 15 times longer
than on Linux (about 18 seconds versus 1.2 seconds in the runs below).
Maybe there is a global lock in the NetBSD VM subsystem that slows things
down as the number of threads increases.
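
For readers unfamiliar with the pre-fault step, the access pattern can
be sketched as follows. This is a minimal Python illustration, not the
sv_mem source: the function names prefault and timed_prefault are
hypothetical, and interpreter overhead dominates the timing, so the
numbers it prints are not comparable to the figures in this message.

```python
import mmap
import time

PAGE_SIZE = mmap.PAGESIZE  # commonly 4096 bytes on x86-64


def prefault(buf, page_size=PAGE_SIZE):
    """Write one byte into every page of buf; return the pages touched."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1  # the first write to a page makes the kernel back it
        touched += 1
    return touched


def timed_prefault(size_bytes):
    """Map anonymous memory, pre-fault it, and return (pages, msec)."""
    buf = mmap.mmap(-1, size_bytes)  # fileno -1 requests anonymous memory
    try:
        start = time.perf_counter()
        pages = prefault(buf)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
    finally:
        buf.close()
    return pages, elapsed_ms


if __name__ == "__main__":
    # 64 MiB, deliberately small compared with the 16 GiB used above.
    pages, ms = timed_prefault(64 * 1024 * 1024)
    print(f"touched {pages} pages in {ms:.2f} msec")
```

Running one such loop per thread, each on its own mapping, is the kind
of workload where contention in the page-fault path would show up.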

So the overall memcpy throughput on NetBSD drops as the number of
threads grows: threads cannot make progress until their pages are
allocated, and if a global lock is causing contention, they sit idle
waiting for it.

Note below how memcpy on NetBSD is faster for individual threads, yet the
overall throughput is less than half that of Linux (6087.94 versus
13978.88 MiB/sec), because the NetBSD VM subsystem acts like a barrier
and stalls those threads until their pages are allocated.
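
The combination of fast individual threads and low overall throughput is
what one would expect if the total is computed as bytes copied divided
by the wall-clock window from the first memcpy start to the last memcpy
finish. That is an assumption about how such a total might be derived,
not something confirmed from the sv_mem source. The function names and
the timings in this sketch are hypothetical.

```python
def per_thread_rate(mib, start_s, end_s):
    """Rate seen by one thread over its own copy interval, in MiB/sec."""
    return mib / (end_s - start_s)


def overall_rate(runs):
    """Total MiB copied divided by the window from first start to last end.

    runs is a list of (mib_copied, start_s, end_s) tuples, one per thread.
    """
    total_mib = sum(mib for mib, _, _ in runs)
    first_start = min(start for _, start, _ in runs)
    last_end = max(end for _, _, end in runs)
    return total_mib / (last_end - first_start)


# Hypothetical case 1: threads are released one at a time, so each one
# copies alone and has the memory bandwidth to itself.
staggered = [(1024, 0.0, 0.5), (1024, 1.5, 2.0)]

# Hypothetical case 2: both threads copy at once and share the bandwidth.
overlapped = [(1024, 0.0, 1.0), (1024, 0.0, 1.0)]

if __name__ == "__main__":
    for name, runs in (("staggered", staggered), ("overlapped", overlapped)):
        rates = [per_thread_rate(*run) for run in runs]
        print(name, "per-thread:", rates, "overall:", overall_rate(runs))
```

In this toy case the staggered threads each report 2048 MiB/sec but the
overall rate is 1024 MiB/sec, while the overlapped threads each report
1024 MiB/sec and the overall rate is 2048 MiB/sec.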


NetBSD:
$ ./sv_mem -mode=wr -size=1g -block=1k -threads=16
Thread 5     preflt=16400.12 msec, memcpy=3130.44 MiB/sec
Thread 11    preflt=16931.65 msec, memcpy=3154.73 MiB/sec
Thread 9     preflt=17169.03 msec, memcpy=2514.06 MiB/sec
Thread 4     preflt=17632.37 msec, memcpy=2928.74 MiB/sec
Thread 14    preflt=17696.83 msec, memcpy=2146.89 MiB/sec
Thread 7     preflt=17885.63 msec, memcpy=2926.97 MiB/sec
Thread 1     preflt=17918.38 msec, memcpy=1338.85 MiB/sec
Thread 10    preflt=18316.65 msec, memcpy=2082.36 MiB/sec
Thread 15    preflt=18323.43 msec, memcpy=1338.62 MiB/sec
Thread 12    preflt=18310.89 msec, memcpy=1322.38 MiB/sec
Thread 6     preflt=18363.57 msec, memcpy=1507.58 MiB/sec
Thread 16    preflt=18360.23 msec, memcpy=1909.12 MiB/sec
Thread 8     preflt=18155.39 msec, memcpy=1478.17 MiB/sec
Thread 13    preflt=18236.67 msec, memcpy=1849.76 MiB/sec
Thread 3     preflt=18303.09 msec, memcpy=2116.50 MiB/sec
Thread 2     preflt=17960.70 msec, memcpy=1325.43 MiB/sec
Total transfer rate: 6087.94 MiB/sec

Linux:
$ ./sv_mem -mode=wr -size=1g -block=1k -threads=16
Thread 13    preflt=1182.27 msec, memcpy=902.88 MiB/sec
Thread 9     preflt=1183.55 msec, memcpy=903.02 MiB/sec
Thread 5     preflt=1191.65 msec, memcpy=899.32 MiB/sec
Thread 11    preflt=1186.96 msec, memcpy=897.64 MiB/sec
Thread 7     preflt=1195.46 msec, memcpy=898.71 MiB/sec
Thread 6     preflt=1207.12 msec, memcpy=904.71 MiB/sec
Thread 15    preflt=1194.18 msec, memcpy=896.05 MiB/sec
Thread 4     preflt=1216.37 msec, memcpy=909.09 MiB/sec
Thread 3     preflt=1210.41 msec, memcpy=897.77 MiB/sec
Thread 2     preflt=1210.36 msec, memcpy=896.36 MiB/sec
Thread 12    preflt=1210.59 msec, memcpy=898.79 MiB/sec
Thread 14    preflt=1209.41 msec, memcpy=898.01 MiB/sec
Thread 10    preflt=1210.00 msec, memcpy=896.88 MiB/sec
Thread 1     preflt=1216.32 msec, memcpy=899.56 MiB/sec
Thread 16    preflt=1209.18 msec, memcpy=899.34 MiB/sec
Thread 8     preflt=1231.36 msec, memcpy=910.00 MiB/sec
Total transfer rate: 13978.88 MiB/sec


