Subject: Re: Xscale optimisations
To: David Laight <david@l8s.co.uk>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 10/14/2003 15:40:51
rearnsha@arm.com said:
> Actually, the DNARD PAL comments suggest it's more complicated than
> that: AFAICT a cache line fill will take 14 clock ticks and a line
> write 12 clocks. 8 individual stores could take as many as 56
> clocks, so there would be a clear win to pre-fetching the line
> (potentially a factor 4 performance improvement).
Doh! 56 / (14 + 12) ~= 2, not 4: the pre-fetched line still has to be
filled and then written back. Still quite a potential win,
particularly for bzero.
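For anyone checking the arithmetic, here is a minimal sketch using only the
clock counts quoted above. The function and constant names are made up for
illustration, and the 56-clock figure is the worst case, so treat the result
as an upper bound rather than a measurement.

```python
# Clock counts as quoted in the message above (from the DNARD PAL comments).
LINE_FILL_CLOCKS = 14          # one cache line fill
LINE_WRITE_CLOCKS = 12         # one cache line write
WORST_CASE_STORE_CLOCKS = 56   # upper bound quoted for 8 individual stores


def prefetch_speedup(store_clocks=WORST_CASE_STORE_CLOCKS,
                     fill_clocks=LINE_FILL_CLOCKS,
                     write_clocks=LINE_WRITE_CLOCKS):
    """Ratio of the individual-store cost to the fill-plus-write cost for one line."""
    return store_clocks / (fill_clocks + write_clocks)


if __name__ == "__main__":
    # 56 / 26, i.e. roughly a factor of 2 rather than 4.
    print(f"{prefetch_speedup():.2f}")
```

If the individual stores turn out cheaper than the worst case, the ratio
drops accordingly; passing a smaller store_clocks value shows how quickly.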
R.