Subject: openssh-3.9.1: request for Makefile flag to link with ports crypto
To: None <tech-pkg@NetBSD.org>
From: S. C. Sprong <s.c.sprong@student.utwente.nl>
List: tech-pkg
Date: 04/25/2005 21:50:05
PKGNAME=                openssh-3.9.1
MAINTAINER=             tech-pkg@NetBSD.org


Hello,

I have a request for the openssl and openssh ports in the NetBSD pkg system.
I'd really, really like to see an _easy to manipulate_ option/flag to link
with libcrypto from either the base system or the port system. As it is, I
always build custom ports, but this gets old.

Included are some benchmark results I gathered over the years: a comparison
between v7 and v8 code on an SS 5@110, and a comparison between v8 code on
NetBSD 1.6.2, libsparc_v8 on NetBSD 2.0.2, and -mv8 on NetBSD 2.0.2.

The libsparc_v8 in NetBSD is a long-awaited and very nice idea, but currently a
'pure' -mv8 compile is 60% faster for ssh.
However, OpenSSH_3.9p1 + OpenSSL 0.9.7d 17 Mar 2004 is even 340% faster!


regards,
scs

NB: What I'd really like to see is an implementation of FreeBSD's per-library
mapping feature (/etc/libmap.conf).


8<--

A) Host: Sun SparcStation 5 MicroSparc@110 MHz, 64 MB RAM
   Vanilla Sparc v7 build:

OpenSSL 0.9.6g 9 Aug 2002
built on: NetBSD 1.6
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,16,long) blowfish(ptr)
compiler: gcc version 2.95.3 20010315 (release) (NetBSD nb3)
The 'numbers' are in 1000s of bytes per second processed.
type              8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                 33.05k       94.26k      128.59k      138.06k      142.93k
mdc2                95.70k      101.92k      102.62k      103.07k      102.88k
md4                634.93k     3174.94k     6259.75k     8248.42k     9091.30k
md5                556.59k     2467.27k     4812.12k     6263.50k     6895.93k
hmac(md5)          163.35k     1068.10k     2920.60k     5180.23k     6658.81k
sha1               316.04k     1166.72k     2235.02k     2819.75k     3162.26k
rmd160             233.05k      651.77k     1556.03k     2352.89k     2772.72k
rc4               3181.84k     3513.31k     3539.89k     3591.21k     3879.14k
des cbc            622.97k      770.29k      781.53k      800.37k      790.36k
des ede3           319.35k      351.19k      356.52k      358.29k      353.88k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc            772.04k      925.45k      946.85k      942.38k      927.78k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc      1069.08k     1757.64k     1870.64k     1941.32k     1797.44k
cast cbc          1179.95k     1470.76k     1547.42k     1613.47k     1496.08k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.4627s   0.0456s      2.2     22.0
rsa 1024 bits   2.8837s   0.1639s      0.3      6.1
rsa 2048 bits  19.6234s   0.6068s      0.1      1.6
                  sign    verify    sign/s verify/s
dsa  512 bits   0.4569s   0.5693s      2.2      1.8
dsa 1024 bits   1.5994s   1.9861s      0.6      0.5
dsa 2048 bits   5.9440s   7.2612s      0.2      0.1


B) Host: Sun SparcStation 5 MicroSparc@110 MHz, 64 MB RAM
   Custom Sparc v8 ports build:

OpenSSL 0.9.6g 9 Aug 2002
built on: Sun Sep 29 21:44:40 CEST 2002
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,4,long) idea(int) blowfish(idx)
compiler: gcc -fPIC -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -Wall -DB_ENDIAN -O2 -mv8 -pipe
The 'numbers' are in 1000s of bytes per second processed.
type              8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                 49.95k      141.03k      189.85k      207.44k      213.92k
mdc2               137.73k      146.52k      146.06k      146.05k      146.52k
md4                839.91k     4113.74k     7458.57k     9367.53k    10068.03k
md5                585.86k     2867.28k     5149.70k     6469.33k     6958.57k
hmac(md5)          232.86k     1459.29k     3591.04k     5679.59k     6833.39k
sha1               359.02k     1638.08k     2890.28k     3547.19k     3823.30k
rmd160             267.10k      691.02k     1619.63k     2375.95k     2796.32k
rc4               2684.01k     3004.21k     3055.36k     3036.18k     3012.18k
des cbc            759.14k      871.94k      897.77k      900.39k      880.45k
des ede3           293.99k      322.92k      323.14k      323.80k      321.17k
idea cbc           541.38k      610.67k      617.41k      622.85k      615.09k
rc2 cbc            762.13k      906.38k      922.76k      921.70k      917.50k
rc5-32/12 cbc     1598.81k     2391.49k     2522.92k     2552.25k     2574.12k
blowfish cbc      1267.63k     1643.98k     1686.05k     1716.26k     1822.96k
cast cbc          1101.13k     1358.36k     1415.02k     1415.70k     1374.88k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.1010s   0.0096s      9.9    104.0
rsa 1024 bits   0.6005s   0.0334s      1.7     29.9
rsa 2048 bits   4.0009s   0.1218s      0.2      8.2
rsa 4096 bits  28.3344s   0.4576s      0.0      2.2
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0970s   0.1217s     10.3      8.2
dsa 1024 bits   0.3297s   0.4106s      3.0      2.4
dsa 2048 bits   1.1862s   1.4777s      0.8      0.7


C) Host: Sun SparcStation 5 Turbosparc@170 MHz, 256 MB RAM
   Custom Sparc v8 buildworld:

OpenSSL 0.9.6g 9 Aug 2002
built on: NetBSD 1.6.2
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,16,long) blowfish(ptr)
compiler: gcc version 2.95.3 20010315 (release) (NetBSD nb3)
The 'numbers' are in 1000s of bytes per second processed.
type              8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                100.37k      279.34k      377.26k      411.16k      426.90k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4               1553.85k     6159.94k    11420.96k    14497.55k    15805.91k
md5               1150.16k     5008.06k     8443.01k    10152.14k    10806.34k
hmac(md5)          358.87k     2015.71k     5211.22k     8525.35k    10572.05k
sha1               579.40k     2051.64k     3755.82k     4804.50k     5182.01k
rmd160             387.28k     1223.48k     2717.98k     3891.83k     4369.62k
rc4               4919.96k     5953.30k     6039.28k     6026.44k     6050.00k
des cbc           1179.88k     1417.17k     1431.07k     1445.06k     1432.92k
des ede3           468.11k      499.06k      499.89k      502.68k      501.61k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc           1369.97k     1484.20k     1508.01k     1502.48k     1494.19k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc      2437.57k     3111.63k     3226.91k     3274.20k     3230.09k
cast cbc          2202.19k     2561.69k     2601.83k     2621.58k     2601.19k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0432s   0.0037s     23.1    272.8
rsa 1024 bits   0.2282s   0.0120s      4.4     83.6
rsa 2048 bits   1.4237s   0.0420s      0.7     23.8
rsa 4096 bits  10.0924s   0.1566s      0.1      6.4
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0370s   0.0456s     27.0     22.0
dsa 1024 bits   0.1179s   0.1471s      8.5      6.8
dsa 2048 bits   0.4122s   0.5120s      2.4      2.0


D) Host: Sun SparcStation 5 Turbosparc@170 MHz, 256 MB RAM
   Vanilla system libsparc_v8.so:

OpenSSL 0.9.7d 17 Mar 2004
built on: NetBSD 2.0.2
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: gcc version 3.3.3 (NetBSD nb3 20040520)
available timing options: USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                 74.64k      166.29k      240.17k      271.49k      281.38k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4                354.01k     1150.76k     3717.10k     8609.83k    13755.30k
md5                238.46k      800.32k     2619.16k     6026.51k     9891.19k
hmac(md5)          542.25k     1717.23k     4648.55k     8159.36k    10376.66k
sha1               222.56k      641.10k     1888.73k     3657.48k     4970.64k
rmd160             220.46k      529.46k     1518.35k     2970.46k     4083.62k
rc4               6443.25k     7117.33k     7305.54k     7402.10k     7401.41k
des cbc           1375.43k     1488.95k     1513.33k     1532.70k     1546.83k
des ede3           523.13k      537.83k      535.94k      544.59k      540.32k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc           1414.11k     1470.70k     1482.73k     1486.07k     1485.98k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc      2909.96k     3151.14k     3240.50k     3268.46k     3279.17k
cast cbc          2580.72k     2777.31k     2842.56k     2856.51k     2842.90k
aes-128 cbc       1749.56k     1812.15k     1829.84k     1832.48k     1828.52k
aes-192 cbc       1581.66k     1594.55k     1583.81k     1597.83k     1593.79k
aes-256 cbc       1370.86k     1455.06k     1446.81k     1455.02k     1455.78k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0674s   0.0061s     14.8    163.4
rsa 1024 bits   0.3684s   0.0202s      2.7     49.6
rsa 2048 bits   2.4006s   0.0722s      0.4     13.9
rsa 4096 bits  16.8644s   0.2702s      0.1      3.7
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0580s   0.0712s     17.2     14.1
dsa 1024 bits   0.1940s   0.2380s      5.2      4.2
dsa 2048 bits   0.6976s   0.8721s      1.4      1.1


E) Host: Sun SparcStation 5 Turbosparc@170 MHz, 256 MB RAM
   Custom Sparc v8 ports build with libsparc_v8.so tagging along:

OpenSSL 0.9.7f 22 Mar 2005
built on: Mon Apr 25 17:49:09 CEST 2005
options:bn(64,32) md2(char) rc4(idx,int) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -fPIC -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -O2 -O2 -mv8 -pipe -DTERMIOS -DB_ENDIAN -O2 -Wall
available timing options: USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                 96.65k      216.96k      314.24k      351.10k      367.40k
mdc2               140.30k      183.09k      200.54k      205.61k      206.50k
md4                415.26k     1312.31k     4181.26k     9266.45k    14267.99k
md5                288.06k      907.24k     2892.02k     6445.15k     9947.15k
hmac(md5)          574.49k     1587.68k     4417.18k     7938.15k    10407.59k
sha1               259.98k      891.78k     2489.95k     4459.65k     5847.41k
rmd160             239.64k      510.78k     1536.43k     3028.48k     4213.07k
rc4               7470.98k     7996.54k     8142.08k     8184.42k     8015.81k
des cbc           1279.87k     1401.27k     1432.64k     1436.88k     1437.10k
des ede3           519.62k      534.42k      540.45k      535.70k      537.53k
idea cbc          1477.33k     1550.27k     1562.70k     1584.39k     1562.17k
rc2 cbc           1415.32k     1465.32k     1478.03k     1493.08k     1474.18k
rc5-32/12 cbc     3508.78k     4213.26k     4440.08k     4485.10k     4491.98k
blowfish cbc      2725.56k     3127.84k     3207.92k     3246.63k     3231.00k
cast cbc          2355.66k     2524.59k     2581.11k     2601.12k     2568.21k
aes-128 cbc       2299.03k     2416.92k     2465.08k     2470.61k     2434.06k
aes-192 cbc       1902.31k     2016.56k     2052.87k     2059.21k     2035.31k
aes-256 cbc       1740.87k     1857.40k     1889.00k     1895.11k     1878.46k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0189s   0.0018s     52.9    559.2
rsa 1024 bits   0.0920s   0.0049s     10.9    202.7
rsa 2048 bits   0.5560s   0.0158s      1.8     63.1
rsa 4096 bits   3.5873s   0.0554s      0.3     18.1
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0156s   0.0185s     64.0     54.0
dsa 1024 bits   0.0453s   0.0558s     22.1     17.9
dsa 2048 bits   0.1503s   0.1780s      6.7      5.6

8<--
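
For anyone who wants to check relative speeds against the listings above,
here is a minimal Python sketch. It only recomputes ratios from the
"rsa 1024 bits" sign times in listings A through E. Picking that row is an
assumption made for illustration, since the message does not say which
figures the quoted percentages were derived from, and the names
RSA1024_SIGN_SECONDS, percent_faster and compare are hypothetical helpers,
not part of any tool.

```python
# RSA 1024-bit sign times in seconds, copied from the "rsa 1024 bits"
# rows of listings A through E. Using this row is an assumption; other
# rows give different ratios.
RSA1024_SIGN_SECONDS = {
    "A": 2.8837,  # MicroSparc@110, v7 build, OpenSSL 0.9.6g
    "B": 0.6005,  # MicroSparc@110, v8 ports build, OpenSSL 0.9.6g
    "C": 0.2282,  # Turbosparc@170, v8 buildworld, OpenSSL 0.9.6g
    "D": 0.3684,  # Turbosparc@170, system libsparc_v8.so, OpenSSL 0.9.7d
    "E": 0.0920,  # Turbosparc@170, v8 ports build, OpenSSL 0.9.7f
}


def percent_faster(slow_seconds, fast_seconds):
    """Return how much faster the second timing is, in percent.

    A result of 100.0 means the operation takes half the time.
    """
    return (slow_seconds / fast_seconds - 1.0) * 100.0


def compare(baseline, candidate, table=RSA1024_SIGN_SECONDS):
    """Compare two listings by their letter, e.g. compare("D", "E")."""
    return percent_faster(table[baseline], table[candidate])


if __name__ == "__main__":
    # Pairs are (baseline, candidate); each pair shares the same host.
    for baseline, candidate in (("A", "B"), ("D", "C"), ("D", "E")):
        gain = compare(baseline, candidate)
        print(f"{candidate} vs {baseline}: {gain:.1f}% faster")
```

On this one row, C comes out roughly 61% faster than D and E roughly 300%
faster than D. Note that C, D and E also differ in OpenSSL version and
compiler, so these ratios do not isolate the effect of -mv8 alone.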