Subject: 1.6E stray irq for SCSI controller halts system.
To: None <port-alpha@netbsd.org>
From: Stephen M. Jones <smj@cirr.com>
List: port-alpha
Date: 08/03/2002 17:50:58
This morning I was working with a new disk array. You might think I'm
brutal, but I generally generate file systems on disk arrays like this:
for i in sd1a sd2a sd3a sd4a sd5a sd6a sd7a
do
	newfs -m 0 /dev/$i &
done
I do it like this because I feel it's a simple test to be sure the controller
and disks work sanely.
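For anyone who wants the same test to also report which newfs failed, here is
a rough sketch of a wrapper. The function name run_per_device is made up for
illustration and is not part of any tool; it only assumes a POSIX sh with
background jobs and wait.

```shell
#!/bin/sh
# Hypothetical helper (not a real utility): run one command per device in
# the background, then wait for each job and report the ones that failed.
# Usage: run_per_device CMD DEVICE...
run_per_device() {
    cmd=$1
    shift
    pids=""
    for dev in "$@"; do
        # $cmd is left unquoted on purpose so "newfs -m 0" splits into words.
        $cmd "/dev/$dev" &
        pids="$pids $!:$dev"
    done
    failed=0
    for entry in $pids; do
        pid=${entry%%:*}
        dev=${entry#*:}
        if ! wait "$pid"; then
            echo "FAILED: $dev"
            failed=$((failed + 1))
        fi
    done
    # Exit status is the number of failed jobs (0 means all succeeded).
    return "$failed"
}
```

With that defined, the loop above would become something like
run_per_device "newfs -m 0" sd1a sd2a sd3a sd4a sd5a sd6a sd7a
which still starts all seven newfs runs at once, so the load on the
controller is the same.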
While doing this I got a couple of stray interrupt complaints, and repeatedly,
after 3 of them, the operating system halted. I've seen this before on the
5305 (this is the API CS20 NetBSD 1.6E (SVERIGE) #0: Tue Jul 30 22:22:12 UTC 2002)
with a 3Com Ethernet controller. It was exactly the same scenario: after roughly
10 or 15 stray interrupts, the system would just lock up.
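To compare how many stray interrupts each machine takes before it locks up,
something like the following could be run over a saved console log. This is
only a sketch: count_stray is a made-up name, and it assumes the kernel
messages contain the word "stray", since the exact message text is not quoted
here.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a saved console log that mention
# a stray interrupt. Assumption: the messages contain the word "stray".
# Usage: count_stray LOGFILE
count_stray() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that
    # so callers only need to look at the printed number.
    grep -ci 'stray' "$1" || true
}
```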
Info on the device:
ahc0 at pci0 dev 5 function 0
ahc0: interrupting at dec 6600 irq 24
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
Debugger output:
cpu_Debugger() at cpu_Debugger+0x4
comintr() at comintr+0x168
alpha_shared_intr_dispatch() at alpha_shared_intr_dispatch+0x6c
sio_iointr() at sio_iointr+0x68
interrupt() at interrupt+0x304
XentInt() at XentInt+0x1c
--- interrupt (from ipl 0) ---
idle() at idle+0x70
idle() at idle+0x54
--- root of call graph ---
db{0}> buf
No such command
db{0}> show buf
CPU 0: fatal kernel trap:
CPU 0 trap entry = 0x4 (unaligned access fault)
CPU 0 a0 = 0xfffffc00004bacc4
CPU 0 a1 = 0x29
CPU 0 a2 = 0x11
CPU 0 pc = 0xfffffc00004103b4
CPU 0 ra = 0xfffffc00003a6784
CPU 0 pv = 0xfffffc00003dfc00
CPU 0 curproc = 0x0
Caught exception in ddb.
db{0}> show map
MAP 0xfffffc0000514550: [0xfffffe0000000000->0xffffffffffffe000]
#ent=12, sz=382664704, ref=1, version=593, flags=0x1
pmap=0xfffffc000055d5b8(resident=4604)
db{0}> show event
evcnt type 0: FP proc use = 227
evcnt type 0: FP proc re-use = 36354
evcnt type 1: soft serial = 3411
evcnt type 1: soft net = 45
evcnt type 1: soft clock = 1550
evcnt type 1: cpu0 clock = 981787
evcnt type 1: cpu0 device = 301316
evcnt type 1: cpu0 ipi = 424674
evcnt type 1: cpu0 shootdown ipi = 423958
evcnt type 1: cpu0 imb ipi = 343
evcnt type 1: cpu0 synch fpu ipi = 451
evcnt type 1: cpu0 discard fpu ipi = 31
evcnt type 1: cpu1 clock = 968933
evcnt type 1: cpu1 ipi = 9254
evcnt type 1: cpu1 microset ipi = 946
evcnt type 1: cpu1 shootdown ipi = 8096
evcnt type 1: cpu1 imb ipi = 160
evcnt type 1: cpu1 synch fpu ipi = 75
evcnt type 1: cpu1 discard fpu ipi = 3
evcnt type 1: cpu1 pause ipi = 2
evcnt type 1: isa irq 4 = 3413
db{0}> show registers
v0 0xf9 rn+0xd9
t0 0xfffffc000050e9ec db_fromconsole
t1 0x1
t2 0
t3 0xfffffc0000513b58 cn_magic
t4 0
t5 0xa42e3 rn+0xa42c3
t6 0x10556000
t7 0x10000 rn+0xffe0
s0 0xfffffe0000102e00
s1 0xfffffc000050f3c0 com_cnm_state
s2 0xfffffc0000559a68 tsp_configuration+0x18
s3 0xc6 rn+0xa6
s4 0xf9 rn+0xd9
s5 0xfffffd01fc0003f8
s6 0xfffffe0000117360
a0 0xfffffc0000559a50 tsp_configuration
a1 0xfffffd01fc0003f8
a2 0xfffffd01fc0003fd
a3 0xfffffc00028e9df0 end+0x238c270
a4 0
a5 0x109 rn+0xe9
t8 0xfffffc00005591d8 vm_physmem
t9 0xfffffc00004ab990 microtime+0xb0
t10 0x1ff289a2b14c0
t11 0x31aa4752
ra 0xfffffc0000320708 comintr+0x168
t12 0xfffffc00004bac20 cpu_Debugger
at 0x4
gp 0xfffffc0000508830 special_symbols+0x8160
sp 0xfffffc00028e9d10 end+0x238c190
pc 0xfffffc00004bac24 cpu_Debugger+0x4
ps 0x4
ai 0x31aa4752
pv 0xfffffc00004bac20 cpu_Debugger
cpu_Debugger+0x4: ret zero,(ra)