NetBSD-Users archive
Re: Testing memory performance
OK, I disabled NUMA in the BIOS. There is a slight performance hit, but
NetBSD is still much slower than Linux. This time I ran a single-threaded
test; the disparity grows with the number of threads.
NetBSD:
$ ./sv_mem -mode=wr -size=16g -block=1k -threads=1
Thread 1 preflt=11285.07 msec, memcpy=3056.22 MiB/sec
Total transfer rate: 3056.22 MiB/sec
Linux:
$ ./sv_mem -mode=wr -size=16g -block=1k -threads=1
Thread 1 preflt=7319.33 msec, memcpy=5089.21 MiB/sec
Total transfer rate: 5089.21 MiB/sec
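Note that pre-faulting 16 GiB of pages (touching 1 byte in every 4 KiB
page) took NetBSD around 11 seconds and Linux around 7 seconds. With 16
concurrent threads (1 GiB each, results below), the NetBSD pre-fault takes
about 18 seconds per thread against about 1.2 seconds on Linux, roughly
15 times longer.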
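For anyone who wants to reproduce the pre-fault step without sv_mem, here is a minimal sketch in Python. It is not the sv_mem code (which is not shown in this thread); the function name `prefault` and the 64 MiB demo size are my own choices. It maps anonymous memory and writes one byte per page, so each first write should trigger a page fault. Timings from Python are dominated by interpreter overhead, so they are only useful for comparing one OS against another, not against the sv_mem numbers.

```python
import mmap
import time

def prefault(size_bytes, page_size=mmap.PAGESIZE):
    """Map anonymous memory and write one byte per page.

    Returns (pages_touched, elapsed_seconds). Hypothetical helper,
    not taken from sv_mem.
    """
    buf = mmap.mmap(-1, size_bytes)  # anonymous mapping, pages not yet touched
    start = time.perf_counter()
    pages = 0
    for offset in range(0, size_bytes, page_size):
        buf[offset] = 1  # first write to a page makes the kernel allocate it
        pages += 1
    elapsed = time.perf_counter() - start
    buf.close()
    return pages, elapsed

if __name__ == "__main__":
    size = 64 * 1024 * 1024  # 64 MiB keeps the demo quick; the test above used 16 GiB
    pages, secs = prefault(size)
    print(f"touched {pages} pages in {secs * 1000:.2f} msec")
```

Running several copies of this at once (one per core) would be a rough way to see whether the per-page cost climbs with concurrency.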
Maybe there is a global lock in the NetBSD VM subsystem that slows things
down as the number of threads goes up.
If so, the average memcpy throughput on NetBSD drops with more threads
because they can't make progress until their pages are allocated: the
global lock causes contention, so they sit idle, waiting.
Note below how memcpy in the individual NetBSD threads is faster, but the
overall throughput is less than half that of Linux (6087.94 vs. 13978.88
MiB/sec), because the NetBSD VM subsystem acts like a barrier and stalls
those threads until pages are allocated.
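To illustrate how fast individual threads can still add up to a low total, here is a toy model in Python. It assumes the total rate is all data moved divided by the wall-clock span from the first thread starting to the last one finishing; the thread does not say how sv_mem actually computes its total, so treat that as my assumption. The function names and the two-thread intervals are made up for illustration.

```python
def per_thread_rates(mib_per_thread, intervals):
    """MiB/sec of each thread, measured over its own (start, end) interval."""
    return [mib_per_thread / (end - start) for start, end in intervals]

def aggregate_rate(mib_per_thread, intervals):
    """MiB/sec overall: total data divided by the wall-clock span
    from the earliest start to the latest finish (assumed definition)."""
    total_mib = mib_per_thread * len(intervals)
    first_start = min(start for start, _ in intervals)
    last_end = max(end for _, end in intervals)
    return total_mib / (last_end - first_start)

# Two hypothetical threads, each copying 1024 MiB in 0.5 s.
overlapped = [(0.0, 0.5), (0.0, 0.5)]  # both copy at the same time
staggered = [(0.0, 0.5), (0.5, 1.0)]   # second thread stalls until the first is done

for name, intervals in (("overlapped", overlapped), ("staggered", staggered)):
    print(name, per_thread_rates(1024, intervals), aggregate_rate(1024, intervals))
```

In both cases each thread reports the same rate for itself, but the staggered run has half the aggregate rate, which is the kind of pattern stalled threads would produce.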
NetBSD:
$ ./sv_mem -mode=wr -size=1g -block=1k -threads=16
Thread 5 preflt=16400.12 msec, memcpy=3130.44 MiB/sec
Thread 11 preflt=16931.65 msec, memcpy=3154.73 MiB/sec
Thread 9 preflt=17169.03 msec, memcpy=2514.06 MiB/sec
Thread 4 preflt=17632.37 msec, memcpy=2928.74 MiB/sec
Thread 14 preflt=17696.83 msec, memcpy=2146.89 MiB/sec
Thread 7 preflt=17885.63 msec, memcpy=2926.97 MiB/sec
Thread 1 preflt=17918.38 msec, memcpy=1338.85 MiB/sec
Thread 10 preflt=18316.65 msec, memcpy=2082.36 MiB/sec
Thread 15 preflt=18323.43 msec, memcpy=1338.62 MiB/sec
Thread 12 preflt=18310.89 msec, memcpy=1322.38 MiB/sec
Thread 6 preflt=18363.57 msec, memcpy=1507.58 MiB/sec
Thread 16 preflt=18360.23 msec, memcpy=1909.12 MiB/sec
Thread 8 preflt=18155.39 msec, memcpy=1478.17 MiB/sec
Thread 13 preflt=18236.67 msec, memcpy=1849.76 MiB/sec
Thread 3 preflt=18303.09 msec, memcpy=2116.50 MiB/sec
Thread 2 preflt=17960.70 msec, memcpy=1325.43 MiB/sec
Total transfer rate: 6087.94 MiB/sec
Linux:
$ ./sv_mem -mode=wr -size=1g -block=1k -threads=16
Thread 13 preflt=1182.27 msec, memcpy=902.88 MiB/sec
Thread 9 preflt=1183.55 msec, memcpy=903.02 MiB/sec
Thread 5 preflt=1191.65 msec, memcpy=899.32 MiB/sec
Thread 11 preflt=1186.96 msec, memcpy=897.64 MiB/sec
Thread 7 preflt=1195.46 msec, memcpy=898.71 MiB/sec
Thread 6 preflt=1207.12 msec, memcpy=904.71 MiB/sec
Thread 15 preflt=1194.18 msec, memcpy=896.05 MiB/sec
Thread 4 preflt=1216.37 msec, memcpy=909.09 MiB/sec
Thread 3 preflt=1210.41 msec, memcpy=897.77 MiB/sec
Thread 2 preflt=1210.36 msec, memcpy=896.36 MiB/sec
Thread 12 preflt=1210.59 msec, memcpy=898.79 MiB/sec
Thread 14 preflt=1209.41 msec, memcpy=898.01 MiB/sec
Thread 10 preflt=1210.00 msec, memcpy=896.88 MiB/sec
Thread 1 preflt=1216.32 msec, memcpy=899.56 MiB/sec
Thread 16 preflt=1209.18 msec, memcpy=899.34 MiB/sec
Thread 8 preflt=1231.36 msec, memcpy=910.00 MiB/sec
Total transfer rate: 13978.88 MiB/sec
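To put numbers on the two 16-thread runs above, here is a small Python sketch that parses the per-thread lines and compares them. The regular expression only assumes the line format shown in the output above; `summarize` and the dictionary keys are names I made up.

```python
import re

LINE = re.compile(r"Thread (\d+) preflt=([\d.]+) msec, memcpy=([\d.]+) MiB/sec")

NETBSD = """\
Thread 5 preflt=16400.12 msec, memcpy=3130.44 MiB/sec
Thread 11 preflt=16931.65 msec, memcpy=3154.73 MiB/sec
Thread 9 preflt=17169.03 msec, memcpy=2514.06 MiB/sec
Thread 4 preflt=17632.37 msec, memcpy=2928.74 MiB/sec
Thread 14 preflt=17696.83 msec, memcpy=2146.89 MiB/sec
Thread 7 preflt=17885.63 msec, memcpy=2926.97 MiB/sec
Thread 1 preflt=17918.38 msec, memcpy=1338.85 MiB/sec
Thread 10 preflt=18316.65 msec, memcpy=2082.36 MiB/sec
Thread 15 preflt=18323.43 msec, memcpy=1338.62 MiB/sec
Thread 12 preflt=18310.89 msec, memcpy=1322.38 MiB/sec
Thread 6 preflt=18363.57 msec, memcpy=1507.58 MiB/sec
Thread 16 preflt=18360.23 msec, memcpy=1909.12 MiB/sec
Thread 8 preflt=18155.39 msec, memcpy=1478.17 MiB/sec
Thread 13 preflt=18236.67 msec, memcpy=1849.76 MiB/sec
Thread 3 preflt=18303.09 msec, memcpy=2116.50 MiB/sec
Thread 2 preflt=17960.70 msec, memcpy=1325.43 MiB/sec
"""

LINUX = """\
Thread 13 preflt=1182.27 msec, memcpy=902.88 MiB/sec
Thread 9 preflt=1183.55 msec, memcpy=903.02 MiB/sec
Thread 5 preflt=1191.65 msec, memcpy=899.32 MiB/sec
Thread 11 preflt=1186.96 msec, memcpy=897.64 MiB/sec
Thread 7 preflt=1195.46 msec, memcpy=898.71 MiB/sec
Thread 6 preflt=1207.12 msec, memcpy=904.71 MiB/sec
Thread 15 preflt=1194.18 msec, memcpy=896.05 MiB/sec
Thread 4 preflt=1216.37 msec, memcpy=909.09 MiB/sec
Thread 3 preflt=1210.41 msec, memcpy=897.77 MiB/sec
Thread 2 preflt=1210.36 msec, memcpy=896.36 MiB/sec
Thread 12 preflt=1210.59 msec, memcpy=898.79 MiB/sec
Thread 14 preflt=1209.41 msec, memcpy=898.01 MiB/sec
Thread 10 preflt=1210.00 msec, memcpy=896.88 MiB/sec
Thread 1 preflt=1216.32 msec, memcpy=899.56 MiB/sec
Thread 16 preflt=1209.18 msec, memcpy=899.34 MiB/sec
Thread 8 preflt=1231.36 msec, memcpy=910.00 MiB/sec
"""

def summarize(output):
    """Return thread count, mean pre-fault time and summed memcpy rate."""
    rows = [(float(p), float(m)) for _, p, m in LINE.findall(output)]
    return {
        "threads": len(rows),
        "mean_preflt_msec": sum(p for p, _ in rows) / len(rows),
        "sum_memcpy_mib_s": sum(m for _, m in rows),
    }

nb, lx = summarize(NETBSD), summarize(LINUX)
ratio = nb["mean_preflt_msec"] / lx["mean_preflt_msec"]
print(f"mean preflt: NetBSD {nb['mean_preflt_msec']:.2f} msec, "
      f"Linux {lx['mean_preflt_msec']:.2f} msec, ratio {ratio:.1f}")
print(f"sum of per-thread memcpy: NetBSD {nb['sum_memcpy_mib_s']:.2f}, "
      f"Linux {lx['sum_memcpy_mib_s']:.2f} MiB/sec")
```

On these figures the mean pre-fault time is roughly 15 times longer on NetBSD. The per-thread memcpy rates sum to about 33071 MiB/sec on NetBSD against a reported total of 6087.94, while on Linux they sum to about 14408 against 13978.88, which suggests the NetBSD threads were not all copying at the same time.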