Subject: Re: bcopy optimisation
To: None <port-arm32@NetBSD.ORG>
From: Olly Betts <olly@mantis.co.uk>
List: port-arm32
Date: 07/09/1996 00:18:32
In traditional net style, I've just spotted an error I introduced in munging
the code into my previous mail.
Olly Betts writes:
>[snip]
>|_alignedwordcpy|
>|_alignedwordcpylp3|
> SUBS R2,R2,#4
> LDRGE R3,[ip],#4
> STRGE R3,[R1],#4
>; to unroll this loop, repeat these 3 instructions
> SUBGES R2,R2,#4
> LDRGE R3,[ip],#4
> STRGE R3,[R1],#4
>;
> BNE |_alignedwordcpylp3|
Make that: BGT |_alignedwordcpylp3|
> MOVS PC,R14
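For anyone wondering why the branch condition matters: with the loop unrolled, the second SUBGES can take R2 from 0 to -4, and a count that isn't a multiple of 4 goes negative without ever hitting zero. In both cases BNE keeps branching, while BGT falls through. Below is a rough C model of the loop's control flow only, not the real routine. It assumes R2 holds the remaining byte count on entry (the setup code is snipped above, so that is a guess from the SUBS/LDRGE pattern) and that the count is small enough that the subtract never overflows, so GE is just "result >= 0" and GT is "result > 0". The names model_alignedwordcpy, loop_branch and max_iters are made up for the illustration.

```c
#include <stdint.h>

/* Branch condition used at the bottom of the loop (hypothetical names). */
enum loop_branch { BRANCH_NE, BRANCH_GT };

/*
 * Model of the unrolled loop for a byte count in R2.
 * Returns the number of words copied, or -1 if the loop is still
 * running after max_iters passes (it failed to terminate in time).
 * Assumption: counts are small enough that SUBS never overflows
 * (V clear), so GE means result >= 0 and GT means result > 0.
 */
int model_alignedwordcpy(int32_t r2, enum loop_branch br, int max_iters)
{
    int words = 0;

    for (int i = 0; i < max_iters; i++) {
        int32_t flags;              /* result of the last flag-setting subtract */
        int taken;

        r2 -= 4;                    /* SUBS   R2,R2,#4 */
        flags = r2;
        if (flags >= 0) {           /* GE */
            words++;                /* LDRGE / STRGE */
            r2 -= 4;                /* SUBGES R2,R2,#4 (only runs if GE) */
            flags = r2;
            if (flags >= 0)
                words++;            /* LDRGE / STRGE */
        }

        /* BNE branches on "not zero", BGT on "greater than zero". */
        taken = (br == BRANCH_NE) ? (flags != 0) : (flags > 0);
        if (!taken)
            return words;           /* falls through to MOVS PC,R14 */
    }
    return -1;
}
```

In this model a byte count of 8 copies 2 words and stops with either branch. With a count of 4 or 6 the BNE variant is still looping when max_iters runs out, while the BGT variant stops after 1 word.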
And it's actually 21% faster than SharedCLibrary on aligned word blocks (not
25%).
Olly