Subject: Re: improving ssh performance on sun4m systems
To: None <port-sparc@netbsd.org>
From: Thilo Manske <Thilo.Manske@HEH.Uni-Oldenburg.DE>
List: port-sparc
Date: 03/14/2002 11:14:12
On Wed, Mar 13 2002 at 18:25:43 -0800, Aaron J. Grier wrote:
> pkgsrc/devel/cpuflags can help. on my 110MHz sparc 5 running 1.5.2,
> cpuflags adds '-mcpu=supersparc' which almost doubles my dsa
> performance:
[...]
> so gcc is obviously producing much better code on this machine when
> -mcpu=supersparc is used. and this is with the older compiler...
BTW: You get slightly better results using -mv8 than -msupersparc on an SS5 (I
think -msupersparc doesn't only enable the v8 instructions, it also tunes the
code for superscalar execution on SuperSPARC CPUs).
> gcc version egcs-2.91.66 19990314 (egcs-1.1.2 release)
>
> it wouldn't surprise me if the newer gcc is even better about optimizing
> and / or tuning.
Well, if you mean 2.95, then I must say that in the untuned case it's often
worse, and I think in your case (µSparc II) it's worse even with tuned code.
I did some tests early this year since I noticed some performance loss after
upgrading my compilers to the new toolchain (that's why I don't have results
for the old compiler in the tuned cases...). FWIW, here are the not-so-
scientific results using dhrystone as the benchmark:
dhrystone                      libc (compiled with standard NetBSD flags)
compiler  flags                version  compiler  dhrystones/s
SPARCStation IPC (25MHz MB86901):
2.91      -O2                  12.61    2.92             24950  < default² (before)
2.91      -O2                  12.81    2.95             20080
2.95      -O2                  12.81    2.95             20180  < default (after)
2.95      -O2 -mcypress(*)     12.81    2.95             20260
=> Which makes a performance decrease of about 20% using the default flags :-(
SPARCStation IPX (40MHz MB86903):
2.91      -O2                  12.61    2.92             42350  < default² (before)
2.91      -O2                  12.81    2.95             30070
2.95      -O2                  12.81    2.95             31880  < default (after)
2.95      -O2 -mcypress(*)     12.81    2.95             32000
=> ~-25% in the default case
And now for some Sun4m systems:
SPARCStation Classic [X] (50MHz µSparc):
2.91      -O2                  12.61    2.92             46080  < default² (before)
2.91      -O2                  12.81    2.95             45170
2.95      -O2                  12.81    2.95             51280  < default (after)
2.95      -O2 -mv8             12.81    2.95             58280
2.95      -O2 -msupersparc(*)  12.81    2.95             58140
=> ~+10%
SPARCStation 4/85 (85MHz µSPARC II):
2.91      -O2                  12.61    2.92             95240  < default³ (before)
2.91      -O2                  12.61    2.92             90740  < default² (before)
2.91      -O2                  12.81    2.95             82920
2.95      -O2                  12.81    2.95             87560  < default (after)
2.95      -O2 -mv8             12.81    2.95            113400
2.95      -O2 -msupersparc(*)  12.81    2.95            110000
=> ~-10%
And on a SPARCStation 20/71 (75MHz Supersparc II):
2.91      -O2                  12.61    2.92            112600  < default² (before)
2.91      -O2                  12.81    2.95            119800
2.95      -O2                  12.81    2.95            118700  < default (after)
2.95      -O2 -mv8             12.81    2.95            127700
2.95      -O2 -msupersparc(*)  12.81    2.95            127400
=> ~+5%
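For anyone who wants to recheck the "=>" percentages, here is a small Python sketch that recomputes them from the numbers in the tables. The dictionary and function names are only illustrative, and the second table (default vs. -mv8 with the same 2.95 compiler and libc) is an extra comparison that the summary lines above do not state explicitly.

```python
# Default-flag results (dhrystones/s) copied from the tables above.
# Each entry: machine -> (default before, default after).
DEFAULTS = {
    "IPC": (24950, 20180),
    "IPX": (42350, 31880),
    "Classic": (46080, 51280),
    "SS4 (old kernel)": (95240, 87560),
    "SS4 (new kernel)": (90740, 87560),
    "SS20/71": (112600, 118700),
}

# Sun4m only: default (after) vs. the -mv8 build, both gcc 2.95 / libc 12.81.
TUNED_V8 = {
    "Classic": (51280, 58280),
    "SS4": (87560, 113400),
    "SS20/71": (118700, 127700),
}


def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


def report(title, table):
    """Print one line per machine with the relative change."""
    print(title)
    for machine, (before, after) in table.items():
        change = percent_change(before, after)
        print(f"  {machine:<18} {before:>7} -> {after:>7}  {change:+6.1f}%")


if __name__ == "__main__":
    report("default (before) -> default (after):", DEFAULTS)
    report("default (after) -> -O2 -mv8:", TUNED_V8)
```

Note that the SPARCStation 4 figure depends on which "before" row is taken as the baseline, which is why both are listed.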
Remarks:
- "before"/"after" is before/after I switched to the new toolchain
- The old dhrystone binary I found on my system was compiled on April 17, 2000,
and libc.12.61 in May 2000
- tests were repeated 3 times with 1E6 runs, the average was taken and rounded
- all Kernels for Sun4m systems were compiled with "-mv8"
*) this was the suggested optimization of pkg devel/cpuflags for that machine
²) but with the new kernel (gcc 2.95 compiled, NetBSD 1.5ZA); results may be
better with an older kernel of matching date (see SPARCStation 4 results), but
don't ask me why... (I think the dhrystone benchmark doesn't make many syscalls)
³) with old kernel (gcc 2.92 compiled, NetBSD 1.5W from June)
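The averaging step from the remarks can be sketched in Python as below. The three run values are made-up placeholders, not measurements, and rounding to four significant digits is an assumption based on how the tabulated numbers look; the remarks only say "rounded".

```python
import math


def round_sig(value, digits=4):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def summarize(runs, digits=4):
    """Average several dhrystones/s readings and round the mean."""
    mean = sum(runs) / len(runs)
    return int(round_sig(mean, digits))


if __name__ == "__main__":
    # Hypothetical readings from three runs on one machine (placeholders).
    sample_runs = [20150.0, 20210.0, 20190.0]
    print(summarize(sample_runs))
```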
BTW: On MIPS the situation is similar. I hope GCC 3 generated code will be
back to the 2.92 quality (in terms of execution speed).
--
This is Thilo's Unix signature! Have fun with it.