Subject: Re: Accelerating memset/memcpy
To: None <cgd@broadcom.com>
From: Nigel Stephens <nigel@mips.com>
List: port-mips
Date: 10/01/2002 18:20:15

cgd@broadcom.com wrote:
>Note the MIPS32 and MIPS64 specs also include the following in their
>description of the 'pref' opcode, which are inconsistent with
>PrepareForStore's description:
>
>* "The action taken for a specific PREF instruction is both system and
> context dependent. Any action, including doing nothing, is
> permitted as long as it does not change architecturally visible
> state or alter the meaning of a program."
>
>* "A hint value cannot cause an action to modify architecturally
> visible state."
>
>(Zeroing a line of memory is most definitely a modification of
>architecturally visible state. 8-)
>
Right, you obviously shouldn't rely on it doing anything at all, and in
particular shouldn't rely on its side-effect of clearing the line to
zero (i.e. to do an ultra-fast bzero!).
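To make that concrete, here is a minimal sketch of a bzero that treats the prefetch purely as a hint. The names hinted_bzero, hint_prepare_for_store and LINE_SIZE are made up for illustration, the 32-byte line size is an assumption, and __builtin_prefetch (a GCC/Clang builtin) stands in for a real "pref 30", which would need assembly. Every byte is still stored explicitly, so the result is the same whether the hint zeroes the line, merely allocates it, or does nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32u   /* assumed cache line size; real CPUs differ */

/* Hypothetical helper: tell the cache a store to this line is coming.
 * __builtin_prefetch is only a stand-in for a PrepareForStore-style
 * hint; the CPU is free to ignore it entirely. */
static inline void hint_prepare_for_store(void *p)
{
#if defined(__GNUC__)
    __builtin_prefetch(p, 1 /* write */, 0 /* no temporal locality */);
#else
    (void)p;
#endif
}

/* Zero n bytes at dst without depending on what the hint does. */
void hinted_bzero(void *dst, size_t n)
{
    unsigned char *d = dst;
    size_t i;

    assert(dst != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* Hint only at the start of a line that will be overwritten
         * in full, since a PrepareForStore-style hint may discard the
         * line's old contents. */
        if ((uintptr_t)(d + i) % LINE_SIZE == 0 && n - i >= LINE_SIZE)
            hint_prepare_for_store(d + i);
        d[i] = 0;   /* the explicit store is what guarantees the zero */
    }
}
```

A production version would store whole words rather than bytes; the byte loop just keeps the sketch short.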
> (I mention this because, well,
>hey, you're a channel that might be used to get documentation fixes
>back in. Those are from MIPS64 Volume II, rev 0.95, page 243.)
>
Sure, I'll feed this back to the author.
>Anyway, despite the pseudo-standardization of the 'hint' fields
>("pseudo" because "any action, including doing nothing, is permitted")
>because of:
>
>* historical differences from the standardized hints,
>
>* differences in even MIPS32/MIPS64 processors about which are
> implemented and how, and, of course,
>
>* microarchitectural differences,
>
>it really doesn't make sense to try to apply a blanket
>'mips32/mips64-optimized' memcpy (et al) to the kernel. They really
>should be selected on a per-cpu basis.
>
Good point. So there are at least three alternatives to start with: "pref
30", "create_dirty_exclusive" and "none". Let's hope that covers most of
them - now we just have to do a survey! But at least "pref 30" should be
a safe initial assumption for any MIPS32/MIPS64 processor.
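As a rough sketch of what per-CPU selection could look like: a lookup table maps a CPU identifier to one of the three strategies and falls back to "pref 30" for an otherwise unknown MIPS32/MIPS64 part. Everything here is hypothetical, including the enum, the function name select_store_hint and the table entries, which are placeholders rather than survey results.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three strategies named above. */
enum store_hint {
    HINT_NONE,
    HINT_PREF_30,
    HINT_CREATE_DIRTY_EXCLUSIVE
};

struct cpu_hint_entry {
    const char *cpu_name;     /* hypothetical CPU identifier */
    enum store_hint hint;
};

/* Placeholder entries, invented for illustration only. A real table
 * would be filled in from a survey of actual processors. */
static const struct cpu_hint_entry hint_table[] = {
    { "example-cpu-a", HINT_CREATE_DIRTY_EXCLUSIVE },
    { "example-cpu-b", HINT_NONE },
};

/* Pick a strategy for the named CPU. Unknown CPUs get "pref 30" if
 * they are MIPS32/MIPS64, and no hint at all otherwise. */
enum store_hint select_store_hint(const char *cpu_name,
                                  int is_mips32_or_mips64)
{
    size_t i;

    assert(cpu_name != NULL);
    for (i = 0; i < sizeof hint_table / sizeof hint_table[0]; i++) {
        if (strcmp(hint_table[i].cpu_name, cpu_name) == 0)
            return hint_table[i].hint;
    }
    return is_mips32_or_mips64 ? HINT_PREF_30 : HINT_NONE;
}
```

The kernel would call something like this once at boot and then install the matching memcpy/memset variant, so the copy loops themselves carry no per-call dispatch cost.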
Nigel