Subject: Re: small pmap.c change proposed
To: Toru Nishimura <nisimura@itc.aist-nara.ac.jp>
From: Simon Burge <simonb@netbsd.org>
List: port-mips
Date: 03/30/2000 00:54:09
Toru Nishimura wrote:

> Folks,
> 
> Will the following hack make any performance difference with yours?

Only tested on an R4400 so far, but if anything it's just slightly worse.
I'll test on a 5000/240 tomorrow.  Here are results for a kernel a few
days old (1.4W) first, then one with your patch (the first pair of 1.4X
runs) and a stock 1.4X last, with two runs of each.  I've deleted
benchmark results that are reasonably identical.

                 L M B E N C H  1 . 9   S U M M A R Y
                 ------------------------------------
                 (Alpha software, do not distribute)

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct sig  sig  fork exec sh  
                             call  I/O stat clos       inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
pmax-netb   NetBSD 1.4W  117  4.3  22.  121  141 0.31K  9.2   25 5.8K  43K  67K
pmax-netb   NetBSD 1.4W  117  4.3  22.  121  139 0.31K  9.2   25 6.6K  43K  68K
pmax-netb   NetBSD 1.4X  118  3.5  18.  106  121 0.31K  8.6   26 6.3K  43K  68K
pmax-netb   NetBSD 1.4X  118  3.5  18.  104  118 0.31K  8.6   26 6.7K  43K  69K
pmax-netb   NetBSD 1.4X  118  3.5  18.  101  128 0.31K  8.3   26 6.0K  42K  68K
pmax-netb   NetBSD 1.4X  118  3.5  19.  116  121 0.31K  8.3   26 6.7K  43K  69K

*** System call overhead seems to have come down slightly between 1.4W
and 1.4X.

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
pmax-netb   NetBSD 1.4W   21    157    554   223   1261     282    1602
pmax-netb   NetBSD 1.4W   23    227    502   243   1190     259    1763
pmax-netb   NetBSD 1.4X   22    146    876   228   1098     264    1558
pmax-netb   NetBSD 1.4X   23    156    733   207   1158     313    1580
pmax-netb   NetBSD 1.4X   31    166    482   206    888     253    1313
pmax-netb   NetBSD 1.4X   33    148    929   170    978     225    1423

*** Results are a bit random, but for the larger context switch cases
stock 1.4X is best.

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
pmax-netb   NetBSD 1.4W    21   108  110                             
pmax-netb   NetBSD 1.4W    23   108  121                             
pmax-netb   NetBSD 1.4X    22   123  125                             
pmax-netb   NetBSD 1.4X    23   184  124                             
pmax-netb   NetBSD 1.4X    31   127  125                             
pmax-netb   NetBSD 1.4X    33   127  128                             

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host                 OS   0K File      10K File      Mmap    Prot    Page       
                        Create Delete Create Delete  Latency Fault   Fault 
--------- ------------- ------ ------ ------ ------  ------- -----   ----- 
pmax-netb   NetBSD 1.4W   3030   1162   5555   2857   320222     5    9.1K
pmax-netb   NetBSD 1.4W   2857   1149   4761   3333   317606     5    9.2K
pmax-netb   NetBSD 1.4X   2127   1282   4545   3703   160102          7.6K
pmax-netb   NetBSD 1.4X   2777   1149   4761   3225   160139          7.7K
pmax-netb   NetBSD 1.4X   2777   1149   5263   3125   161697          8.1K
pmax-netb   NetBSD 1.4X   2777   1123   5263   2941   159695          7.8K

*** Really interesting result here - mmap latency has almost halved
between 1.4W and 1.4X and page faults are ~10% quicker.

Simon.