Subject: Re: New kinetic figures
To: None <thorpej@zembu.com>
From: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
List: port-arm32
Date: 02/10/2001 13:52:31
> So, this will make the unmapping of the entire address space happen
> while the process is not curproc. In fact, the unmap in the non-curproc
> path was already happening, but there was a redundant unmap in exit1().
>
> It made things ever so slightly faster on my 700MHz P-III -- I wasn't
> expecting to see much improvement on that system :-) Anyway, please
> try it on your ARM systems and tell me what improvement you see.
>
When configuring GNU Make, this cuts the number of full cache flushes in pmap_remove by 70% and the number of partial flushes by 50%. Together with the other changes I have made to the ARM pmap, it finally moves sa110_cache_purgeID off the number one spot in my profile. Top of the list is now bcopy_page, followed closely by uvm_fault.
In terms of overall performance: before I started hacking on the ARM pmap, a profiled run of GNU Make's configure script took 3m9s wall-clock on an otherwise idle system. The same job now takes 2m45s, and it is noticeable that we have:
1) Halved the total number of cache flushes.
2) Halved the number of calls to raisespl/splx.
So I think your change is definitely a good move.
Richard.
Top ten routines prior to changes (excluding mcount):
     % cumulative     self              self    total
  time    seconds  seconds    calls  us/call  us/call  name
 15.00      21.82    21.82                             _mcount
  6.91      31.87    10.05    24025   418.31   418.31  sa110_cache_purgeID
  4.64      38.62     6.75    35528   189.99   189.99  bcopy_page
  4.39      45.01     6.39   123484    51.75   353.57  uvm_fault
  3.79      50.52     5.51  2139083     2.58     2.58  splx
  3.49      55.60     5.08  2118976     2.40     2.40  raisespl
  2.98      59.93     4.33   539910     8.02     9.12  pmap_vac_me_harder
  2.83      64.05     4.12    41802    98.56    98.56  bzero_page
  2.66      67.92     3.87   196475    19.70   197.54  data_abort_handler
  2.49      71.55     3.63                             mcount
  2.47      75.14     3.59   719901     4.99     4.99  lockmgr
  2.45      78.70     3.56   253677    14.03    24.30  pmap_enter_pv
Top ten routines after all changes (excluding mcount):
     % cumulative     self              self    total
  time    seconds  seconds    calls  us/call  us/call  name
 14.56      18.85    18.85                             _mcount
  5.82      26.38     7.53    37590   200.32   200.32  bcopy_page
  4.46      32.15     5.77   130557    44.20   260.67  uvm_fault
  4.35      37.78     5.63    12728   442.33   442.33  sa110_cache_purgeID
  3.26      42.00     4.22  1066604     3.96     3.96  splx
  3.21      46.15     4.15   202325    20.51   167.71  data_abort_handler
  3.17      50.25     4.10    39890   102.78   102.78  bzero_page
  3.01      54.14     3.89   287465    13.53    27.63  pmap_enter
  3.00      58.02     3.88   746751     5.20     5.23  lockmgr
  2.80      61.64     3.62  1850805     1.96     1.96  pmap_pte
  2.71      65.15     3.51                             mcount
  2.60      68.52     3.37   264467    12.74    12.74  pmap_enter_pv
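For anyone who wants to check the "halved" claims against the two listings, here is a rough sketch of the arithmetic. The call counts and wall-clock times are copied from this message; the helper name pct_change and the choice of routines are illustrative only. raisespl is left out because it no longer appears in the second listing, so its new call count is not available here.

```python
# Call counts copied from the two profile listings: (before, after).
calls = {
    "sa110_cache_purgeID": (24025, 12728),
    "splx": (2139083, 1066604),
    "bcopy_page": (35528, 37590),
    "uvm_fault": (123484, 130557),
}


def pct_change(before, after):
    """Relative change from before to after, in percent (negative = reduction)."""
    return (after - before) / before * 100.0


for name, (before, after) in calls.items():
    change = pct_change(before, after)
    print(f"{name:20s} {before:>9d} -> {after:>9d}  {change:+6.1f}%")

# Wall-clock times quoted in the message: 3m9s before, 2m45s after.
wall_before = 3 * 60 + 9
wall_after = 2 * 60 + 45
wall_change = pct_change(wall_before, wall_after)
print(f"{'wall-clock (s)':20s} {wall_before:>9d} -> {wall_after:>9d}  {wall_change:+6.1f}%")
```

The calls to sa110_cache_purgeID and splx both come out at roughly half their earlier counts, while bcopy_page and uvm_fault are called slightly more often than before, which fits them now heading the list.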