dual-core i386, current, SCHED_M2

Running a build of pkgsrc packages, the system seemed to stop making any further progress at one point, but was still (sluggishly) responsive. top(1) showed that all the CPU time was being spent spinning among 4 gmake processes, each getting ~50% CPU time (in sys). It looked like some of the other processes they had spawned weren't getting any time to make progress, and the gmakes were somehow busy-waiting on them.

Renicing the gmakes down didn't seem to help. pkill -STOP'ing them, and then selectively kill -CONT'ing them, allowed further progress.

It looks to me like the scheduler is repeatedly picking "the other gmake", which is itself only waiting on something, as the next process to run, and never running the things the gmakes are actually waiting for. It may be exacerbated by something in the way gmake waits for children or pipes (every so often, top shows one of them in pipe rather than CPU state). Maybe there's even something else at play (from vmlocking?) that causes gmake to use so much CPU spinning, but whatever that is, there's clearly a fair-scheduling problem as well.

I'll experiment further (say, with more or less MAKE_JOBS) to see how the problem moves around. With 3 of the 4 gmakes running and one stopped, I have 3 makes taking 50% of a CPU, and the rest of the CPU used by cc and others making some progress.

-- Dan.