Subject: Re: MI SONIC Ethernet driver for mac68k
To: None <hauke@Espresso.Rhein-Neckar.DE>
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
List: port-mac68k
Date: 06/05/2007 23:15:43
hauke@Espresso.Rhein-Neckar.DE wrote:

> >> Do you have a performance comparison for the old vs. the MI one?
> >
> >Unfortunately, MI one is slower (currently).
> 
> Can you time the transfers from the other (I assume, non-mac68k) machine
> for comparison?

The other side is NetBSD/i386 (Athlon64) connected via re(4)
and a Gig switch.

I've rerun similar tests with more recent (today's) sources
that include my esp(4) fix. The MI driver now gets slightly better
results than before, but it's still slower than the old MD one on TX:
---

with old MD driver:

on mac68k side:
---
 :
root file system type: ffs
Enter pathname of shell or RETURN for /bin/sh: 
We recommend creating a non-root account and using su(1) for root access.
No entry for terminal type "dumb";
using dumb terminal settings.
# mount -a -t nonfs
# ifconfig sn0 192.168.20.35
# dmesg|grep sn0
sn0 at obio0: integrated Ethernet adapter
sn0: Ethernet address 08:00:07:9f:07:c6
# ./ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.1
ttcp-r: 16777216 bytes in 19.33 real seconds = 847.75 KB/sec +++
ttcp-r: 2049 I/O calls, msec/call = 9.66, calls/sec = 106.02
ttcp-r: 0.0user 19.2sys 0:19real 99% 0i+0d 0maxrss 0+2pf 0+0csw
# ./ttcp -ts 192.168.20.1
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.1
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 15.93 real seconds = 1028.54 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 7.96, calls/sec = 128.57
ttcp-t: 0.1user 15.4sys 0:15real 97% 0i+0d 0maxrss 0+4098pf 0+0csw
# 
---

on i386 side:
---
% dmesg|grep cpu0
cpu0 at mainbus0 apid 0: (boot processor)
cpu0: AMD Athlon 64 or Sempron (686-class), 2210.86 MHz, id 0x40ff2
 :
cpu0: "AMD Athlon(tm) 64 Processor 3500+"
 :
% uname -mrs
NetBSD 4.99.20 i386
% ttcp -ts 192.168.20.35
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.35
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 19.36 real seconds = 846.22 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.68, calls/sec = 105.78
ttcp-t: -1.9user 0.0sys 0:19real 0% 0i+0d 0maxrss 0+4098pf 0+0csw
% ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.35
ttcp-r: 16777216 bytes in 15.97 real seconds = 1025.70 KB/sec +++
ttcp-r: 11586 I/O calls, msec/call = 1.41, calls/sec = 725.33
ttcp-r: 0.0user 0.0sys 0:15real 0% 0i+0d 0maxrss 0+2pf 0+0csw
% 
---


with MI driver:

on mac68k side:
---
 :
using dumb terminal settings.
# mount -a -t nonfs
# ifconfig sn0 192.168.20.35
# dmesg|grep sn0
sn0 at obio0: integrated SONIC Ethernet adapter
sn0: Ethernet address 08:00:07:9f:07:c6
# ./ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.1
ttcp-r: 16777216 bytes in 19.14 real seconds = 855.99 KB/sec +++
ttcp-r: 2049 I/O calls, msec/call = 9.57, calls/sec = 107.05
ttcp-r: 0.0user 19.0sys 0:19real 99% 0i+0d 0maxrss 0+2pf 0+0csw
# ./ttcp -ts 192.168.20.1
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.1
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 20.61 real seconds = 794.98 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 10.30, calls/sec = 99.37
ttcp-t: 0.1user 20.4sys 0:20real 99% 0i+0d 0maxrss 0+4098pf 0+0csw
# 
---

on i386 side:
---
% ttcp -ts 192.168.20.35
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp  -> 192.168.20.35
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 19.18 real seconds = 854.25 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.59, calls/sec = 106.78
ttcp-t: -1.9user 0.0sys 0:19real 0% 0i+0d 0maxrss 0+4098pf 0+0csw
% ttcp -rs
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 192.168.20.35
ttcp-r: 16777216 bytes in 20.65 real seconds = 793.25 KB/sec +++
ttcp-r: 12273 I/O calls, msec/call = 1.72, calls/sec = 594.21
ttcp-r: 0.0user 0.0sys 0:20real 0% 0i+0d 0maxrss 0+2pf 0+0csw
% 
---

Summary:
        TX on sn0    RX on sn0
MD:     1026 KB/s     846 KB/s
MI:      793 KB/s     854 KB/s
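
To put the summary figures in relative terms, here is a minimal sketch in
Python. It was not part of the test runs; the helper name relative_change
and the dictionary layout are hypothetical, and the KB/sec values are copied
from the i386-side ttcp lines above.

```python
# KB/sec as reported by ttcp on the i386 side (copied from the logs above).
results = {
    "MD": {"tx": 1025.70, "rx": 846.22},
    "MI": {"tx": 793.25, "rx": 854.25},
}


def relative_change(new, old):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Compare the MI driver against the old MD driver in each direction.
tx_change = relative_change(results["MI"]["tx"], results["MD"]["tx"])
rx_change = relative_change(results["MI"]["rx"], results["MD"]["rx"])

print(f"TX on sn0: MI vs MD {tx_change:+.1f}%")
print(f"RX on sn0: MI vs MD {rx_change:+.1f}%")
```

With these inputs, TX comes out roughly 23% lower with the MI driver, while
RX differs by about 1%.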

- RX looks mostly the same.
  Maybe I forgot to update <m68k/types.h>, in which case the MI dp83932.c
  might do extra copies due to the lack of __NO_STRICT_ALIGNMENT,
  and the bottleneck is in some upper layer?

- TX is still slower with the MI driver.
  Maybe the MI dp83932.c sets up too many DMA descriptors
  in order to send fragmented mbufs directly, and the cache flush ops
  for those descriptors are more expensive than copying the mbufs
  into an uncached contiguous buffer?
  (If so, adding BUS_DMA_COHERENT support may improve performance.)
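
The TX guess above can be illustrated with a toy cost model. This is only a
sketch of the reasoning, not driver code: all three cost constants and the
function names are hypothetical placeholders, not measurements from mac68k
hardware, and whether the real costs compare this way is exactly the open
question.

```python
# Toy cost model. Every constant is a made-up placeholder, not a measurement.
FLUSH_COST_PER_SEGMENT = 5.0  # hypothetical cost of one cache flush per DMA segment
DESC_SETUP_COST = 1.0         # hypothetical cost of filling in one descriptor
COPY_COST_PER_BYTE = 0.01     # hypothetical cost per byte copied by the CPU


def scatter_gather_cost(fragment_sizes):
    """Cost of handing each mbuf fragment to the chip as its own DMA segment."""
    return len(fragment_sizes) * (FLUSH_COST_PER_SEGMENT + DESC_SETUP_COST)


def copy_cost(fragment_sizes):
    """Cost of copying all fragments into one contiguous uncached buffer."""
    return sum(fragment_sizes) * COPY_COST_PER_BYTE + DESC_SETUP_COST


def cheaper_strategy(fragment_sizes):
    """Name the cheaper approach for one packet under this toy model."""
    if copy_cost(fragment_sizes) < scatter_gather_cost(fragment_sizes):
        return "copy"
    return "scatter-gather"


# A short packet split across several small mbufs vs. one contiguous packet.
print(cheaper_strategy([14, 20, 20, 46]))
print(cheaper_strategy([1514]))
```

Under these assumed constants, per-segment overhead dominates for heavily
fragmented packets, so copying wins there, while a single large segment
favors direct DMA. Real numbers would have to come from profiling the driver.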

> Since, as I understand, the MD driver does buffer-to-memory
> transfers by cpu, it may well lock out timer interrupts and lose clock
> ticks, possibly skewing your timing results.

Actually, I see that the esp(4) driver on mac68k has such a problem
(softclock seems to be blocked too much, according to vmstat -i),
but the MD mac68k/dev/if_sn.c doesn't use splhigh() at all,
so I don't think it causes tick loss.
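
One rough cross-check for tick loss is to compare the elapsed time each end
reported for the same transfer: if the mac68k clock lost many ticks, its
"real seconds" would come out noticeably shorter than the i386's. The sketch
below does that with the values from the ttcp output above; the function
name clock_skew_percent is hypothetical.

```python
# (label, seconds measured on mac68k, seconds measured on i386) for the same
# transfer, copied from the ttcp output above.
runs = [
    ("MD, RX on sn0", 19.33, 19.36),
    ("MD, TX on sn0", 15.93, 15.97),
    ("MI, RX on sn0", 19.14, 19.18),
    ("MI, TX on sn0", 20.61, 20.65),
]


def clock_skew_percent(mac68k_secs, i386_secs):
    """Difference of the mac68k timing relative to the i386 timing, in percent."""
    return (mac68k_secs - i386_secs) / i386_secs * 100.0


skews = [(label, clock_skew_percent(m, p)) for label, m, p in runs]
for label, skew in skews:
    print(f"{label}: {skew:+.2f}%")
```

All four runs agree to within about 0.3%, which is far smaller than the TX
gap between the two drivers. The two ends do not start and stop their timers
at exactly the same moment, so a small difference is expected either way.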
---
Izumi Tsutsui