Subject: Re: ddb help
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Eduardo Horvath <eeh@NetBSD.org>
List: port-sparc64
Date: 10/25/2004 18:46:53
On Sun, Oct 24, 2004 at 10:19:02PM +0200, Manuel Bouyer wrote:
> Hi,
> I can't understand how this can happen. Is it possible that ddb is printing
> the wrong address here, or is missing a function call in the stack frame ?
> This is a current GENERIC32 kernel, recompiled with -g
Stack traces are done by traversing the register windows saved to the stack
and printing out the linkage pointers. It is possible that the register
windows were never saved to the stack, that they were overwritten, that the
stack pointer is pointing to the wrong place, or that there have been some
tail calls and the bottom register window has been recycled. In this
instance it is most likely the last of these.
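To make the traversal concrete, here is a rough sketch of that walk. It is a
simplified model and not the actual ddb code: the struct saved_window layout
and the walk_frames() name are made up for illustration, and it ignores the
stack bias and any validity checks on the frame pointer. It only assumes the
usual SPARC convention that %i6 holds the caller's frame pointer and %i7 the
return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one register window spilled to the stack. */
struct saved_window {
    uintptr_t local[8];   /* %l0-%l7 */
    uintptr_t in[8];      /* %i0-%i7; in[6] = %i6 (fp), in[7] = %i7 (return pc) */
};

/*
 * Follow the chain of saved windows starting at fp, recording the
 * return address (linkage pointer) found in each one.  Stops at a
 * null frame pointer or after max entries.  Returns the frame count.
 */
size_t walk_frames(const struct saved_window *fp, uintptr_t *pcs, size_t max)
{
    size_t n = 0;

    assert(pcs != NULL);
    while (fp != NULL && n < max) {
        pcs[n++] = fp->in[7];                          /* who called us */
        fp = (const struct saved_window *)fp->in[6];   /* caller's window */
    }
    return n;
}
```

If a window was never spilled, or %i6 points at garbage, a walk like this
either stops early or prints nonsense, which is why the trace can mislead.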
>
> text_access_fault: pc=0 va=0
> kernel trap 64: +fast instruction access MMU miss
> Stopped in pid 4.1 (atabus0) at 0: undefined
> db> tr
> wdc_ata_bio_start(1dd4150, 1e34000, 0, 0, 0, 1dd4180) at netbsd:wdc_ata_bio_start+0x48c
> atabus_thread(0, 0, ffff, 5bcd, 0, 0) at netbsd:atabus_thread+0x128
> proc_trampoline(0, 0, 0, 0, 0, 0) at netbsd:proc_trampoline+0x4
> db> examine/i wdc_ata_bio_start+0x480
> netbsd:wdc_ata_bio_start+0x480: st %g1, [%l3 + 0x34]
> db>
> netbsd:wdc_ata_bio_start+0x484: stb %g0, [%l0 + 0xa]
> db>
> netbsd:wdc_ata_bio_start+0x488: or %g0, %i0, %o0
> db>
> netbsd:wdc_ata_bio_start+0x48c: call netbsd:wdc_ata_bio_done
> db>
Here's a call to netbsd:wdc_ata_bio_done. It probably calls something
else just before returning, so that second call never got its own stack
frame.
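A toy model of why a tail call drops out of the trace, assuming the tracer
only sees one return address per saved window. Everything here (struct
call_stack, do_call(), do_tail_call(), backtrace_sites()) is hypothetical and
only simulates the bookkeeping; it is not how the kernel or ddb is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 16

/* One entry per register window: the call site that created it. */
struct call_stack {
    const char *call_site[MAX_DEPTH];
    size_t depth;
};

/* Ordinary call: the callee gets a new window that records the call site. */
void do_call(struct call_stack *s, const char *site)
{
    assert(s->depth < MAX_DEPTH);
    s->call_site[s->depth++] = site;
}

/* Tail call: the callee reuses the caller's window, so nothing is recorded. */
void do_tail_call(struct call_stack *s, const char *site)
{
    (void)s;
    (void)site;
}

/* Copy the recorded call sites, innermost first, as a backtrace would. */
size_t backtrace_sites(const struct call_stack *s, const char **out, size_t max)
{
    size_t n = 0;

    while (n < s->depth && n < max) {
        out[n] = s->call_site[s->depth - 1 - n];
        n++;
    }
    return n;
}
```

Feeding it the three ordinary calls from the trace followed by one tail call
out of wdc_ata_bio_done gives back only the three call sites ddb printed; the
tail-called function leaves no entry of its own.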
> netbsd:wdc_ata_bio_start+0x490: or %g0, %i1, %o1
> db>
> netbsd:wdc_ata_bio_start+0x494: ldd [%l2 + 0xc0], %o4
> db>
> netbsd:wdc_ata_bio_start+0x498: ldsb [%l2 + 0xc8], %g4
> db>
> netbsd:wdc_ata_bio_start+0x49c: or %g4, 0x9, %g1
> db>
> netbsd:wdc_ata_bio_start+0x4a0: subcc %g1, 0x1d, %g0
> db> show registers
> tstate 0x44
> pc 0
> npc 0
> ipl 0x5
> y 0
> g0 0
> g1 0
> g2 0
> g3 0
> g4 0
> g5 0
> g6 0
> g7 0xffffffff
> o0 0
> o1 0
> o2 0
> o3 0
> o4 0
> o5 0
> o6 0
> o7 0
> l0 0
> l1 0
> l2 0
> l3 0
> l4 0
> l5 0
> l6 0
> l7 0
> i0 0
> i1 0
> i2 0
> i3 0
> i4 0
> i5 0
> i6 0
> i7 0
Hm. The registers here seem invalid. I can't believe they can all be
zero.
> f0 0x3fb96ced
> f2 0xffffffff
> f4 0xffffffff
> f6 0xffffffff
> f8 0x3fb0c299
> f10 0x3eef7510
> f12 0x41e00000
> f14 0x3ff00000
> f16 0
> f18 0x3eef7510
> f20 0x40000
> f22 0xffffffff
> f24 0xffffffff
> f26 0xffffffff
> f28 0xffffffff
> f30 0xffffffff
> f32 0xffffffff
> f34 0xffffffff
> f36 0xffffffff
> f38 0xffffffff
> f40 0xffffffff
> f42 0xffffffff
> f44 0xffffffff
> f46 0xffffffff
> f48 0xffffffff
> f50 0xffffffff
> f52 0xffffffff
> f54 0xffffffff
> f56 0xffffffff
> f58 0xffffffff
> f60 0xffffffff
> f62 0xffffffff
> fsr 0
> gsr 0
> 0: undefined
>
>
> The matching lines in the sources would be:
> 0x1386f0c is in wdc_ata_bio_start (/local/pop1/bouyer/current/src/sys/dev/ata/ata_wdc.c:309).
> 304 ata_bio->r_error = chp->ch_error;
> 305 ata_bio->error = ERROR;
> 306 }
> 307 ctrldone:
> 308 drvp->state = 0;
> 309 wdc_ata_bio_done(chp, xfer);
> 310 bus_space_write_1(wdr->ctl_iot, wdr->ctl_ioh, wd_aux_ctlr, WDCTL_4BIT);
> 311 return;
> 312 }
>
> Any idea how to debug this further ?
1) Enable traptrace. It should give you a better idea of the calling sequence.
2) If you can find the end of the stack, dump out the bottom trapframe. You
might get a better idea of the machine state from that.
Eduardo