Subject: Interrupts as threads
To: None <tech-kern@netbsd.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-smp
Date: 12/01/2006 23:31:19
Hi,

I have been thinking about the lock ordering problem with the kernel big
lock quite a bit and what it will take to lock the MI kernel down, and have
made some observations.

o There is no easy solution to the lock order problem with the kernel_lock
  when using spin locks.

o Using spin locks we will have to keep the SPL above IPL_NONE for longer
  than before, or accept (in non-trivial cases) the undesirable cost of
  having both interrupt and process context locks around some objects.

o Raising and lowering the SPL is expensive, especially on machines that
  need to talk to the hardware on every SPL operation. The spin lock path also
  has more test+branch pairs / conditional moves and memory references
  involved than process locks. For a process context lock, the minimum we
  can get away with on entry and exit is one test+branch and two cache line
  references.

o Every spin lock / unlock pair denotes a critical section where threads
  running in the kernel cannot be preempted. That's not currently an issue
  but if we move to support real time threads it could become one; I'm not
  sure.

o We are doing too much work from interrupt context.
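
To make the cost comparison in the points above concrete, here is a rough
userland model of the two lock paths. It is only a sketch, not kernel code:
every model_* name is made up for illustration, the SPL is reduced to a plain
integer, and "talking to the hardware" is just a counter. The process lock
shows the uncontended path only; a real one would sleep on contention.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical levels; real IPL values are machine dependent. */
enum { MODEL_IPL_NONE = 0, MODEL_IPL_NET = 1 };

static int model_cur_ipl = MODEL_IPL_NONE;	/* models this CPU's SPL */
static int model_hw_accesses;			/* simulated hardware accesses */

/* Raise the SPL (never lower it) and return the previous level. */
static int
model_splraise(int ipl)
{
	int old = model_cur_ipl;

	if (ipl > old) {
		model_cur_ipl = ipl;
		model_hw_accesses++;
	}
	return old;
}

/* Restore a level previously returned by model_splraise(). */
static void
model_splx(int ipl)
{
	if (ipl != model_cur_ipl) {
		model_cur_ipl = ipl;
		model_hw_accesses++;
	}
}

struct model_spinlock {
	atomic_flag held;
	int ipl;	/* level that blocks every interrupt user of the lock */
	int saved;	/* SPL to restore on exit */
};

struct model_proclock {
	atomic_flag held;
};

static void
model_spin_enter(struct model_spinlock *l)
{
	int s = model_splraise(l->ipl);	/* block interrupts first */

	while (atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire))
		continue;		/* spin until the holder releases */
	l->saved = s;
}

static void
model_spin_exit(struct model_spinlock *l)
{
	int s = l->saved;

	atomic_flag_clear_explicit(&l->held, memory_order_release);
	model_splx(s);			/* only now let interrupts back in */
}

/* Process context lock, uncontended path: no SPL traffic at all. */
static int
model_proc_tryenter(struct model_proclock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire);
}

static void
model_proc_exit(struct model_proclock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In this model each spin lock enter/exit pair starting from IPL_NONE costs two
SPL changes on top of the atomic operations, while the process lock pair
touches only the lock word.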

The cleanest way to deal with these issues that I can see is to use
lightweight threads to handle interrupts. My initial thought is to have one
thread per level, per CPU. These would be able to preempt already running
threads, and would hold preempted threads in-situ until the interrupt thread
returns or switches away. In most cases, SPL operations would be replaced by
locks. Blocking would no longer be prohibited, but strongly discouraged - so
doing something like pool_get(foo, PR_WAITOK) should likely trigger an
assertion.
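
As a rough illustration of the scheme described above, here is a userland
model of one interrupt thread per level on a CPU, with the preempted thread
held in place and a check that would catch a sleeping allocation. This is a
sketch under assumptions, not a proposed implementation: all model_* names
are hypothetical, MODEL_WAITOK and MODEL_NOWAIT merely stand in for PR_WAITOK
and PR_NOWAIT, and the "dispatch" is an ordinary function call rather than a
stack switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NLEVEL	4	/* hypothetical number of interrupt levels */
#define MODEL_WAITOK	0x01	/* stands in for PR_WAITOK */
#define MODEL_NOWAIT	0x02	/* stands in for PR_NOWAIT */

struct model_thread {
	bool is_intr;			/* true for interrupt threads */
	struct model_thread *pinned;	/* thread we preempted, held in place */
};

struct model_cpu {
	struct model_thread *curthread;
	struct model_thread ithread[MODEL_NLEVEL];	/* one per level */
};

typedef void (*model_handler_t)(struct model_cpu *);

/*
 * Run a handler on this CPU's interrupt thread for the given level.
 * The preempted thread stays pinned until the handler returns.
 */
static void
model_intr_dispatch(struct model_cpu *ci, int level, model_handler_t fn)
{
	struct model_thread *it = &ci->ithread[level];

	it->is_intr = true;
	it->pinned = ci->curthread;
	ci->curthread = it;
	fn(ci);
	ci->curthread = it->pinned;	/* resume the preempted thread */
	it->pinned = NULL;
}

/* A sleeping allocation is acceptable only outside interrupt threads. */
static bool
model_alloc_ok(const struct model_cpu *ci, int flags)
{
	bool in_intr = ci->curthread != NULL && ci->curthread->is_intr;

	return !(in_intr && (flags & MODEL_WAITOK) != 0);
}

/* Stand-in for an allocator entry point; just hands back "item". */
static void *
model_pool_get(struct model_cpu *ci, void *item, int flags)
{
	assert(model_alloc_ok(ci, flags));	/* fires for WAITOK in an ISR */
	return item;
}

/* Test handler: records what the check says from interrupt context. */
static int model_probe_waitok = -1;
static int model_probe_nowait = -1;

static void
model_probe_handler(struct model_cpu *ci)
{
	model_probe_waitok = model_alloc_ok(ci, MODEL_WAITOK);
	model_probe_nowait = model_alloc_ok(ci, MODEL_NOWAIT);
}
```

Because each level has its own thread structure, a higher level interrupt
arriving while a lower level handler runs simply pins that handler's thread
in turn, and the chain unwinds in order as the handlers return.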

On something like an x86 or MIPS CPU, we wouldn't need to do a full context
switch for interrupts, just switch onto another stack. For things that are
time critical like clock or audio ISRs I think the current scheme of
deferring the interrupts might be better. Although it should not be common,
the delay involved in switching away when trying to acquire a lock seems
undesirable in these cases. (As an aside, I have been meaning to do some
profiling to see just how often the SPL operations serve their purpose in a
variety of cases but haven't gotten around to it yet.)

Assuming you subscribe to handling interrupts with threads, it raises the
question: where to draw the line between threaded and 'traditional'. It
certainly makes sense to run soft interrupts this way, and I would draw the
line at higher priority ISRs like network, audio, serial and clock.

Thoughts?

Cheers,
Andrew