Subject: port-i386/10313: gdb can't trace through trap()
To: None <gnats-bugs@gnats.netbsd.org>
From: John Hawkinson <jhawk@mit.edu>
List: netbsd-bugs
Date: 06/07/2000 15:01:12
>Number: 10313
>Category: port-i386
>Synopsis: gdb can't trace through trap()
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-i386-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jun 07 15:02:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: John Hawkinson
>Release: NetBSD 1.4.2
>Organization:
MIT
>Environment:
System: NetBSD zorkmid.mit.edu 1.4ZA NetBSD 1.4ZA (ZORKMID-$Revision: 1.13 $) #180: Wed Jun 7 16:31:23 EDT 2000 jhawk@zorkmid.mit.edu:/usr/local/current-src/sys/arch/i386/compile/ZORKMID i386
>Description:
gdb (a) can't seem to print out kernel stack traces through
trap() calls. (b) Its frame manipulation/selection seems to be
broken as well, as "backtrace" only seems to give the original
traceback.
>How-To-Repeat:
Crash your kernel and get a dump. Then:
zorkmid# gdb netbsd.29
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386--netbsd"...(no debugging symbols found)...
(gdb) target kcore netbsd.29.core
panic: trap
#0 0x100 in ?? ()
(gdb) where
#0 0x100 in ?? ()
#1 0xc03049cb in cpu_reboot ()
#2 0xc01bc97d in panic ()
#3 0xc030dd2d in trap ()
(gdb)
This is almost totally useless. Of course, we can steal
the trace from the message buffer now that it's there:
(gdb) printf "%s\n",msgbufp+*(msgbufp+4)-1000
y0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
uvm_fault(0xc04fd728, 0xc57d9000, 0, 1) -> 1
fatal page fault in supervisor mode
trap type 6 code 0 eip c0457d9a cs 8 eflags 10202 cr2 c57d9000 cpl 0
panic: trap
Begin traceback...
_trap() at _trap+0x1e5
--- trap (number 6) ---
_pcmcia_scan_cis(c0746200,c04594fc,c5633ed0,ffffffff,0) at _pcmcia_scan_cis+0x1a6
_pcmcia_read_cis(c0746200,c0746aac,c072f380,c072f380,ffffffff) at _pcmcia_read_cis+0x9c
_pcmcia_card_attach(c0746200) at _pcmcia_card_attach+0x27
_cardslot_event_thread(c072f380) at _cardslot_event_thread+0x1e9
End traceback...
syncing disks... 1 1 done
dumping to dev 0,1 offset 396196
dump 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
(gdb)
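The printf expression above is terse. One reading, sketched here in Python, is that the word at msgbufp+4 holds the buffer's write index, so adding it to msgbufp and backing up 1000 bytes lands near the most recent console output. The header layout and the helper name tail_start are assumptions made for illustration; the report does not show the message buffer structure.

```python
import struct


def tail_start(base_addr, header, back=1000):
    """Return base_addr + write_index - back, mirroring the gdb expression.

    Assumption (not from the report): `header` holds the first bytes found
    at msgbufp, and the little-endian 32-bit word at byte offset 4 is the
    buffer's write index.
    """
    (write_index,) = struct.unpack_from("<I", header, 4)
    return base_addr + write_index - back
```

Under that assumption the start address is only approximate, which matches the transcript beginning in the middle of a line ("y0: screen 1 added").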
But really, that's just too sick. That was (a). As for (b),
well, we presume that the problem is that trap() stores its base
pointer differently from a normal frame. So we should be able to pull
it out and hand it to gdb and have gdb just do the right thing. So if
we go to the trap() frame:
(gdb) where
#0 0x100 in ?? ()
#1 0xc03049cb in cpu_reboot ()
#2 0xc01bc97d in panic ()
#3 0xc030dd2d in trap ()
(gdb) frame 3
#3 0xc030dd2d in trap ()
(gdb) info frame
Stack level 3, frame at 0xc5633c2c:
eip = 0xc030dd2d in trap; saved eip 0xc0100f29
caller of frame at 0xc5633bec
Arglist at 0xc5633c2c, args:
Locals at 0xc5633c2c, Previous frame's sp is 0x0
Saved registers:
ebx at 0xc5633bf8, ebp at 0xc5633c2c, esi at 0xc5633bfc, edi at 0xc5633c00,
eip at 0xc5633c30
(gdb)
Incidentally, why does "ebp" match up with the location of the
frame? Shouldn't ebp be the pointer to the calling frame, not the
current one? It's like this for frame 2 as well, so it's not just
a fluke with the trap frame. According to 'struct trapframe':
struct trapframe {
	int tf_es;
	int tf_ds;
	int tf_edi;
	int tf_esi;
	int tf_ebp;
	int tf_ebx;
	int tf_edx;
	int tf_ecx;
	int tf_eax;
	int tf_trapno;
	/* below portion defined in 386 hardware */
	int tf_err;
	int tf_eip;
...
and a regular frame would be:
struct i386_frame {
	struct i386_frame *f_frame;
	int f_retaddr;
	int f_arg0;
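The field offsets implied by these two layouts can be checked with a short Python sketch using ctypes. It models only the fields quoted here (the trapframe continues past tf_eip), and it assumes 32-bit ints and pointers as on i386. The class names and the helper field_offsets are illustrative, not kernel names.

```python
import ctypes


class Trapframe(ctypes.Structure):
    # Only the fields quoted in the report; the real struct is longer.
    _fields_ = [(name, ctypes.c_int32) for name in (
        "tf_es", "tf_ds", "tf_edi", "tf_esi", "tf_ebp", "tf_ebx",
        "tf_edx", "tf_ecx", "tf_eax", "tf_trapno", "tf_err", "tf_eip")]


class I386Frame(ctypes.Structure):
    # f_frame is a pointer in the kernel; modelled as a 32-bit word here.
    _fields_ = [("f_frame", ctypes.c_uint32),
                ("f_retaddr", ctypes.c_int32),
                ("f_arg0", ctypes.c_int32)]


def field_offsets(struct_type):
    """Map each field name to its byte offset within the structure."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


if __name__ == "__main__":
    for struct_type in (Trapframe, I386Frame):
        print(struct_type.__name__, field_offsets(struct_type))
```

With this model tf_ebp is the fifth word, at byte offset 16, while a regular frame keeps the saved frame pointer at offset 0 and the return address at offset 4.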
If this is a trapframe, tf_ebp sits at fp+(4*4), which should give us
the calling frame:
(gdb) x/x 0xc5633c2c+(4*4)
0xc5633c3c: 0xc5633f40
(gdb) frame 0xc5633f40
#0 0x0 in ?? ()
(gdb) where
#0 0x100 in ?? ()
#1 0xc03049cb in cpu_reboot ()
#2 0xc01bc97d in panic ()
#3 0xc030dd2d in trap ()
(gdb) info frame
Stack level 0, frame at 0xc5633f40:
eip = 0x0; saved eip 0xc0456797
(FRAMELESS), called by frame at 0xc5633f40
Arglist at 0xc5633f40, args:
Locals at 0xc5633f40, Previous frame's sp is 0x0
Saved registers:
ebp at 0xc5633f40, eip at 0xc5633f44
Notice that "where" completely ignored the
specified frame. "info frame" declares the current frame as level 0,
as it should when we've specified one arbitrarily. I suppose the
problem may be that "where" is trying to follow the frame up by es
and losing.
Investigating further, if we print out the frame for trap():
(gdb) x/12x 0xc5633c2c
0xc5633c2c: 0xc5633eb8 0xc0100f29 0xc0740010 0x00000010
0xc5633c3c: 0xc5633f40 0x000000cd 0xc5633eb8 0x0000000d
0xc5633c4c: 0x00000030 0xc57d8000 0x00001000 0x00000006
(gdb) info symbol 0xc0100f29
calltrap + 11 in section .text
It's clear that 0x6 is not a valid eip, so this must be
a regular frame. So why didn't gdb agree? Moving up the stack:
(gdb) x/12x 0xc5633eb8
0xc5633eb8: 0xc5633f40 0xc0457924 0xc0746200 0xc04594fc
0xc5633ec8: 0xc5633ed0 0xffffffff 0x00000000 0x00000000
0xc5633ed8: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) info symbol 0xc0457924
pcmcia_read_cis + 156 in section .text
Zoinks! Also not a trapframe. What gives? If everything is a regular
frame, then this should all Just Work(tm), right?
(gdb) x/12x 0xc5633f40
0xc5633f40: 0xc5633f6c 0xc0456797 0xc0746200 0xc0746aac
0xc5633f50: 0xc072f380 0xc072f380 0xffffffff 0x00000000
0xc5633f60: 0xc5633f94 0xc0449d6b 0xc072f3c4 0xc5633f94
(gdb) info symbol 0xc0456797
pcmcia_card_attach + 39 in section .text
(gdb) info symbol 0xc5633f94
No symbol matches 0xc5633f94.
(gdb) x/12x 0xc5633f6c
0xc5633f6c: 0xc5633f94 0xc0449ed9 0xc0746200 0xc072f380
0xc5633f7c: 0xc0449cf0 0x00000000 0x00000000 0xc072f3c4
0xc5633f8c: 0x00000000 0xc07b69b0 0x00000000 0xc010034b
(gdb) info symbol 0xc0449ed9
cardslot_event_thread + 489 in section .text
(gdb) info symbol 0xc010034b
proc_trampoline + 3 in section .text
And here we come to an end:
(gdb) x/12x 0xc5633f94
0xc5633f94: 0x00000000 0xc010034b 0xc072f380 0x0000c063
0xc5633fa4: 0x00000000 0x00000000 0xc0100345 0x00000000
0xc5633fb4: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) info symbol 0xc010034b
proc_trampoline + 3 in section .text
If we assume the previous frame was a trapframe instead, we get garbage:
(gdb) x/12x 0xc0449cf0
0xc0449cf0 <cardslot_event_thread>: 0x83e58955 0x565710ec 0x08758b53 0x01f845c7
0xc0449d00 <cardslot_event_thread+16>: 0x83000000 0x0f00407e 0x00029b84 0x444e8d00
0xc0449d10 <cardslot_event_thread+32>: 0x90f44d89 0xfa80158b 0xd089c050 0xaee0050b
(gdb) info symbol 0x565710ec
No symbol matches 0x565710ec.
(gdb) info symbol 0xaee0050b
No symbol matches 0xaee0050b.
(gdb)
But I suppose this makes sense as the end of the stack,
and it is what ddb prints.
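The manual walk above amounts to following saved frame pointers until one is zero. A minimal Python sketch of that loop follows, using only the words and symbol lookups shown in the dumps above. MEMORY, SYMBOLS and walk_frames are names made up for this illustration; it is not the gdb script itself.

```python
# First two words of each frame, copied from the x/12x dumps in this report.
MEMORY = {
    0xc5633c2c: 0xc5633eb8, 0xc5633c30: 0xc0100f29,
    0xc5633eb8: 0xc5633f40, 0xc5633ebc: 0xc0457924,
    0xc5633f40: 0xc5633f6c, 0xc5633f44: 0xc0456797,
    0xc5633f6c: 0xc5633f94, 0xc5633f70: 0xc0449ed9,
    0xc5633f94: 0x00000000, 0xc5633f98: 0xc010034b,
}

# Results of the "info symbol" lookups in this report.
SYMBOLS = {
    0xc0100f29: "calltrap+11",
    0xc0457924: "pcmcia_read_cis+156",
    0xc0456797: "pcmcia_card_attach+39",
    0xc0449ed9: "cardslot_event_thread+489",
    0xc010034b: "proc_trampoline+3",
}


def walk_frames(memory, fp, max_frames=64):
    """Follow regular frames: saved ebp at fp, return address at fp+4."""
    frames = []
    while fp != 0 and fp in memory and len(frames) < max_frames:
        frames.append((fp, memory[fp + 4]))
        fp = memory[fp]  # the saved ebp points at the caller's frame
    return frames


if __name__ == "__main__":
    for fp, retaddr in walk_frames(MEMORY, 0xc5633c2c):
        print("frame %#x returns to %s" % (fp, SYMBOLS.get(retaddr, "??")))
```

Starting from the trap() frame at 0xc5633c2c, the loop visits five frames and stops at 0xc5633f94, whose saved frame pointer is zero.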
>Fix:
I suppose I can write a gdb script for printing out stack
frame traces. It would probably be useful even if gdb were fixed. I
guess I'll do up something and check it in under
/sys/arch/i386/gdbscripts/; Sigh...
>Release-Note:
>Audit-Trail:
>Unformatted: