Subject: Re: bcopy, bzero, copypage, and zeropage
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Chris G Demetriou <Chris_G_Demetriou@auchentoshan.pdl.cs.cmu.edu>
List: tech-kern
Date: 12/14/1996 13:24:55
> A much better approach is to have a single function that does the
> comparisons only if configured for multiple CPU variants. If only
> configured for one, the comparisons aren't there (they're #ifdef'd).
>
> This eliminates unnecessary code size and overhead from indirect
> function calls in the critical path.
So, what you're saying is, there are two options, and they are:
"The conditional inclusion method:"
    Generic kernel:
        Direct call + potentially N tests and branches (per cpu type)
        for every call.
    Specific kernel:
        Direct call.

"The N function method:"
    Generic kernel:
        Indirect function call.
    Specific kernel:
        Indirect function call.
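
To make the comparison concrete, here is a rough C sketch of those two
options. All of the names in it (CPU_A, CPU_B, cputype, zeropage_a,
zeropage_b, zeropage_fn) are made up for illustration and are not taken
from any real port; the routine bodies are placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical config options: a GENERIC kernel defines several,
 * a specific kernel defines exactly one. */
#define CPU_A 1
#define CPU_B 1

enum cputype { CPUTYPE_A, CPUTYPE_B };
static enum cputype cputype = CPUTYPE_B;	/* assumed set once at boot */

/* Placeholder CPU-specific routines. */
static void zeropage_a(void *p, size_t n) { memset(p, 0, n); }
static void zeropage_b(void *p, size_t n)
{
	unsigned char *c = p;

	while (n-- > 0)
		*c++ = 0;
}

/*
 * The conditional inclusion method: always a direct call.  The tests
 * are only compiled in when more than one CPU type is configured.
 */
void zeropage(void *p, size_t n)
{
#if defined(CPU_A) && defined(CPU_B)
	if (cputype == CPUTYPE_A)
		zeropage_a(p, n);
	else
		zeropage_b(p, n);
#elif defined(CPU_A)
	zeropage_a(p, n);
#else
	zeropage_b(p, n);
#endif
}

/*
 * The N function method: every caller goes through a pointer, even in
 * a kernel built for one CPU.  It is initialized statically here only
 * to keep the sketch short.
 */
void (*zeropage_fn)(void *, size_t) = zeropage_b;
```

A caller would write zeropage(buf, len) in the first case and
(*zeropage_fn)(buf, len) in the second.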
However, you're missing a third option:
"The conditional inclusion N function method:"
    At boot, pick which cpu-specific set of functions to use, fill in
    their pointers.  Then do whatever you were compiled to do, which is:
        Generic kernel:
            Indirect through the function pointers at each call.
        Specific kernel:
            Call the appropriate CPU-specific functions directly.
This is the best of both worlds (direct call in the specific-kernel
case, just one indirect function call overhead in the generic kernel
case). (I imagine that for many processors, (indirect branch to
subroutine) is going to be much quicker than (direct branch to
subroutine + N * (test + branch)).)
This also has the nice side effect that you don't have to carefully
craft your #ifdefs so that all versions work, GENERIC or not...
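
A minimal sketch of the conditional inclusion N function method, again
in C and again with invented names (CPU_A, CPU_B, cpu_id, cpu_select,
ZEROPAGE, cpu_a_zeropage, cpu_b_zeropage are all assumptions made for
the example, not existing kernel interfaces):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical config options: define both for GENERIC, or just one
 * for a CPU-specific kernel. */
#define CPU_A 1
#define CPU_B 1

enum cpu_id { CPU_ID_A, CPU_ID_B };

/* Placeholder CPU-specific routines. */
static void cpu_a_zeropage(void *p, size_t n) { memset(p, 0, n); }
static void cpu_b_zeropage(void *p, size_t n)
{
	unsigned char *c = p;

	while (n-- > 0)
		*c++ = 0;
}

#if defined(CPU_A) && defined(CPU_B)
/* Generic kernel: one pointer, filled in once at boot, then a single
 * indirect call at each use. */
static void (*zeropage_ptr)(void *, size_t);
#define ZEROPAGE(p, n)	((*zeropage_ptr)((p), (n)))

static void cpu_select(enum cpu_id id)
{
	assert(id == CPU_ID_A || id == CPU_ID_B);
	zeropage_ptr = (id == CPU_ID_A) ? cpu_a_zeropage : cpu_b_zeropage;
}
#elif defined(CPU_A)
/* Specific kernel: the macro resolves to a direct call. */
#define ZEROPAGE(p, n)	cpu_a_zeropage((p), (n))
#define cpu_select(id)	((void)0)
#else
#define ZEROPAGE(p, n)	cpu_b_zeropage((p), (n))
#define cpu_select(id)	((void)0)
#endif
```

Startup code would call cpu_select() once after probing the CPU, and
everything else would just use ZEROPAGE(buf, len). The only #ifdef logic
lives in the one place that defines the macro, so callers look the same
in GENERIC and specific kernels.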
chris