Subject: Re: Bug in x86 ioapic interrupt code for devices with shared interrupts?
To: None <tls@rek.tjls.com>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-kern
Date: 03/03/2006 13:25:21
Thor Lancelot Simon wrote:
> On Fri, Mar 03, 2006 at 11:37:31AM -0800, Jonathan Stone wrote:
>
>> Yes, the interrupt handler in bge(4) (sys/dev/pci/if_bge.c:bge_intr())
>> is known to give an inaccurate return code. That problem can cause
>> interrupts to not be forwarded to other devices sharing the same IRQ.
>> This is a long-known bug in bge(4). However, every time I've tried
>> to turn on the
>> "#ifdef notdef"
>>
>> code in bge_intr(), the resulting kernel hung. If I had a
>> programmer's manual, I'd go looking for ways to ascertain if the bge
>> really interrupted.
>>
>
> Leaving the other issues aside: why should the bge not always claim that
> the interrupt was _not_ for it? Because that would generate "spurious
> interrupt N" messages on the system console?
>
In other operating systems (Solaris, for one), not claiming an interrupt
is poor form. So poor, in fact, that the Solaris framework detects this
as a "stuck interrupt" and will disable that interrupt at the interrupt
controller or processor.

If you're going to interrupt (i.e. the hardware is going to), you had
best be ready to claim it.
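
As a rough illustration of that kind of "stuck interrupt" accounting, here
is a minimal sketch in C. It is not Solaris code: struct irq_line,
irq_account() and the UNCLAIMED_LIMIT threshold are all hypothetical names
and values chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical threshold: consecutive unclaimed interrupts tolerated. */
#define UNCLAIMED_LIMIT 100

/* Hypothetical per-line bookkeeping, not taken from any real kernel. */
struct irq_line {
    unsigned unclaimed_run; /* consecutive dispatches nobody claimed */
    bool masked;            /* true once the line has been disabled */
};

/*
 * Record the outcome of one dispatch on a line.
 * Returns true only on the call that masks the line.
 */
bool irq_account(struct irq_line *line, bool claimed)
{
    if (claimed) {
        /* Any claim resets the run of unclaimed interrupts. */
        line->unclaimed_run = 0;
        return false;
    }
    if (line->masked)
        return false;
    if (++line->unclaimed_run >= UNCLAIMED_LIMIT) {
        /* Treat the line as stuck and disable it. */
        line->masked = true;
        return true;
    }
    return false;
}
```

Under a scheme like this, a driver that always reports "not mine" would
eventually get the whole shared line masked, along with every other device
on it.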
> Perhaps we need a way for the driver to respond "I don't know" in such
> cases, so the interrupt is still passed down the chain, but nothing
> complains.
>
Hmm... apart from the performance impact, is there a compelling reason
not to *always* forward the interrupt down the chain?

If you don't do this, a device which interrupts infrequently can get
totally starved out by a device that interrupts *very* frequently.
(Granted, such a device would likely make the rest of the system
unusable...)
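
To make the trade-off concrete, here is a minimal sketch in C of the two
dispatch policies in question. It is not the NetBSD x86 ioapic code:
struct intr_hand, struct fake_dev, fake_intr(), dispatch_first_claim() and
dispatch_all() are hypothetical names invented for the example.

```c
#include <stddef.h>

/* Hypothetical handler type: returns nonzero if the device claimed it. */
typedef int (*intr_fn)(void *arg);

/* Hypothetical entry in the list of handlers sharing one IRQ. */
struct intr_hand {
    intr_fn fn;
    void *arg;
    struct intr_hand *next;
};

/* Stand-in for a device: one pending flag and a service counter. */
struct fake_dev {
    int pending;
    int serviced;
};

/* Claims the interrupt only if this device really had work pending. */
int fake_intr(void *arg)
{
    struct fake_dev *d = arg;

    if (!d->pending)
        return 0;
    d->pending = 0;
    d->serviced++;
    return 1;
}

/* Policy A: stop at the first handler that claims the interrupt. */
int dispatch_first_claim(struct intr_hand *chain)
{
    for (struct intr_hand *h = chain; h != NULL; h = h->next)
        if (h->fn(h->arg))
            return 1;
    return 0;
}

/* Policy B: call every handler; report whether anyone claimed it. */
int dispatch_all(struct intr_hand *chain)
{
    int claimed = 0;

    for (struct intr_hand *h = chain; h != NULL; h = h->next)
        if (h->fn(h->arg))
            claimed = 1;
    return claimed;
}
```

With two devices pending at once, dispatch_first_claim() services only the
one earlier in the chain and leaves the other waiting, while dispatch_all()
services both in one pass at the cost of calling every handler each time.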
-- Garrett
> Thor
>
--
Garrett D'Amore, Principal Software Engineer
Tadpole Computer / Computing Technologies Division,
General Dynamics C4 Systems
http://www.tadpolecomputer.com/
Phone: 951 325-2134 Fax: 951 325-2191