Subject: Re: Issue with large memory systems, and PPC overhead
To: Chuck Silvers <chuq@chuq.com>
From: Matt Thomas <matt@3am-software.com>
List: tech-kern
Date: 11/08/2002 13:09:37
At 12:54 AM 11/8/2002, Chuck Silvers wrote:
>   %   cumulative   self                 self      total
>  time   seconds   seconds     calls   ns/call   ns/call  name
> 12.04      4.20      4.20    190077  22096.31  31792.03  pmap_remove
>  6.48      6.46      2.26    120044  18826.43  18826.43  vcopypage
>  6.42      8.70      2.24    160293  13974.41  13974.41  __syncicache
>  5.48     10.61      1.91   2011224    949.67   1088.58  pmap_pvo_enter
>  5.19     12.42      1.81  11607659    155.93    155.93  splx
>and on the 604:
>   %   cumulative   self               self     total
>  time   seconds   seconds    calls  us/call  us/call  name
> 15.49      1.78      1.78    13044   136.46   136.46  pmap_copy_page
>  9.23      2.84      1.06    19081    55.55    63.34  pmap_remove
>  7.05      3.65      0.81    16708    48.48    48.48  __syncicache
>  4.70      4.19      0.54   200028     2.70     4.21  pmap_pvo_enter
>  3.57      4.60      0.41   618995     0.66     0.66  splx
It's interesting to see that the Altivec version of pmap_copy_page
(vcopypage) is so much faster than the non-Altivec version.
Given that pmap_remove is 32us/call on the G4/400 and 63us/call on the
604ev/180 (roughly scaling with CPU speed), the difference between
19us for vcopypage and 136us for pmap_copy_page is amazing. Scaled the
same way, pmap_copy_page would take about 68us on the G4, so vcopypage
is over 3 times as fast as you'd expect pmap_copy_page to be there.
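
The scaling argument above can be checked with a few lines of arithmetic.
This is only a rough sketch: it assumes the total per-call time of
pmap_remove is a fair proxy for the overall speed difference between the
two machines, and the variable names are made up for illustration. The
numbers are the per-call totals from the two profiles quoted above.

```python
# Per-call totals from the quoted profiles, converted to microseconds.
g4_pmap_remove_us = 31792.03 / 1000.0     # G4/400, pmap_remove
g4_vcopypage_us = 18826.43 / 1000.0       # G4/400, vcopypage (AltiVec)
p604_pmap_remove_us = 63.34               # 604ev/180, pmap_remove
p604_pmap_copy_page_us = 136.46           # 604ev/180, pmap_copy_page

# Assumption: pmap_remove reflects the general speed ratio of the machines.
machine_ratio = p604_pmap_remove_us / g4_pmap_remove_us

# What pmap_copy_page would cost on the G4 if it only scaled by that ratio.
expected_g4_copy_us = p604_pmap_copy_page_us / machine_ratio

# How much faster vcopypage is than that scaled estimate.
speedup_vs_expected = expected_g4_copy_us / g4_vcopypage_us

print(f"machine ratio (604 / G4):       {machine_ratio:.2f}x")
print(f"expected pmap_copy_page on G4:  {expected_g4_copy_us:.1f} us")
print(f"vcopypage vs. that expectation: {speedup_vs_expected:.1f}x faster")
```

The ratio comes out at roughly 2x, close to the 400/180 clock ratio, which
is why pmap_remove is a reasonable (if crude) yardstick here.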
--
Matt Thomas               Internet:   matt@3am-software.com
3am Software Foundry      WWW URL:    http://www.3am-software.com/bio/matt/
Cupertino, CA             Disclaimer: I avow all knowledge of this message