Subject: Re: Moving scheduler semantics from cpu_switch() to kern_synch.c
To: matthew green <mrg@eterna.com.au>
From: Jason Thorpe <thorpej@shagadelic.org>
List: tech-kern
Date: 09/21/2006 10:33:03
On Sep 21, 2006, at 10:05 AM, matthew green wrote:
> so every process is bound to a cpu? i guess i don't understand how
> this works to avoid cpus idling while lwps are waiting for "their"
> cpu to become free... who runs a new process first? right now it
> is who ever tries to first.
There are strategies for handling this, and plenty of papers written
on the subject. Note that waiting for "their" CPU to become free can
actually perform BETTER than running on some other random CPU, because
you avoid the cold-cache misses and the cache coherency traffic that
the hardware otherwise has to deal with.
David Black, I think, wrote a paper on this some years ago in the
context of Mach, describing something called "processor sets":
David L. Black. Scheduling Support for Concurrency and
Parallelism in the Mach Operating System. IEEE Computer,
23(5):35-43, May 1990.
The idea is that you have the notion of a "processor set", which
contains a list of "processors". The processor set contains the run
queues. A task is bound to a "processor set". This gives you a
convenient way to handle asymmetric processor topology (think: x86
HT), and you can have strict affinity by putting only one processor in
each set. Note that this would actually work REALLY well for multi-HT
CPU systems... each processor set would get both virtual CPUs of a
physical CPU, and since it is the PROCESS that is bound to the
processor set, you ensure that the LWPs for that process only run on
that one physical CPU, which is optimal for multi-threaded processes
on HT systems.
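To make that a little more concrete, here's a rough sketch of what the
data structures might look like. This is purely illustrative -- the
names (struct pset, ps_runq, etc.) are made up here, not from the Mach
paper or our tree:

#include <sys/queue.h>

struct lwp;				/* the real one lives elsewhere */
struct pset;

/* One processor, viewed as a member of a processor set. */
struct pset_cpu {
	TAILQ_ENTRY(pset_cpu) pc_link;	/* linkage within the set */
	struct pset *pc_pset;		/* back-pointer to the owning set */
	/* ... per-CPU scheduler state ... */
};

/* A "processor set": owns the run queue, contains a list of processors. */
struct pset {
	TAILQ_HEAD(, pset_cpu) ps_cpus;	/* processors in this set */
	TAILQ_HEAD(, lwp) ps_runq;	/* runnable LWPs, owned by the set */
	unsigned int ps_ncpus;		/* number of processors in the set */
	unsigned int ps_nprocs;		/* processes bound to this set */
	unsigned int ps_load;		/* decaying load estimate for the set */
};

A process would then carry a pointer to the pset it is bound to, and
cpu_switch() would only ever pull LWPs off the run queue of the set
its CPU belongs to.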
Note that I'm hand-waving the situation where a task's concurrency is
greater than the number of processors in its processor set ... I need
to re-read the paper ... but I figured I might as well start the
discussion :-)
Note that processor sets also provide infrastructure for dynamically
adding and removing CPUs.
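Continuing with the sketch above (same invented names), adding or
removing a CPU would amount to little more than:

/* Add a processor to a set, e.g. at boot or when a CPU comes online. */
void
pset_add_cpu(struct pset *ps, struct pset_cpu *pc)
{
	pc->pc_pset = ps;
	TAILQ_INSERT_TAIL(&ps->ps_cpus, pc, pc_link);
	ps->ps_ncpus++;
}

/* Remove a processor from its set, e.g. before off-lining it. */
void
pset_remove_cpu(struct pset *ps, struct pset_cpu *pc)
{
	TAILQ_REMOVE(&ps->ps_cpus, pc, pc_link);
	pc->pc_pset = NULL;
	ps->ps_ncpus--;
}

(The locking, and draining the LWPs currently running on the departing
CPU, is where the real work would be, of course.)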
As for "which processor set runs it first", you could implement a
number of policies -- random, round-robin, fewest-bound-tasks, lowest-
average-load, etc. There might even be some good paper material
here :-)
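For example, the fewest-bound-tasks policy, using the sketch from
above, would be roughly:

/* Pick the processor set with the fewest processes bound to it. */
struct pset *
pset_choose(struct pset *psets, unsigned int npsets)
{
	struct pset *best = &psets[0];
	unsigned int i;

	for (i = 1; i < npsets; i++) {
		if (psets[i].ps_nprocs < best->ps_nprocs)
			best = &psets[i];
	}
	return best;
}

and the new process would be bound to the returned set at fork/exec
time.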
There is also literature out there that describes strategies for
deciding when to pay the penalty of migrating a task from one
processor (or processor set) to another, but I'd have to refresh my
memory on them before discussing them further. But, basically, the
scheduler could periodically look at the load on each CPU and perform
some sort of re-balancing.
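In its simplest form, that might be a callout that fires every few
seconds, finds the busiest and the idlest set, and migrates something
only when the imbalance is big enough to be worth the cache penalty.
Again just a sketch, with the threshold and the actual migration
hand-waved:

#define MIGRATE_THRESHOLD	2	/* arbitrary; tuning it is the hard part */

void pset_migrate_one(struct pset *, struct pset *);	/* hand-waved */

/* Hypothetical periodic re-balancer. */
void
pset_rebalance(struct pset *psets, unsigned int npsets)
{
	struct pset *busiest = &psets[0], *idlest = &psets[0];
	unsigned int i;

	for (i = 1; i < npsets; i++) {
		if (psets[i].ps_load > busiest->ps_load)
			busiest = &psets[i];
		if (psets[i].ps_load < idlest->ps_load)
			idlest = &psets[i];
	}

	/* Only pay the migration penalty when it's clearly worth it. */
	if (busiest->ps_load > idlest->ps_load + MIGRATE_THRESHOLD)
		pset_migrate_one(busiest, idlest);
}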
Anyway, there are a lot of good lessons to be learned from other
systems, here...
-- thorpej