Subject: Re: port-alpha/35448: memory management fault trap during heavy
To: None <port-alpha-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: netbsd-bugs
Date: 01/22/2007 21:40:02
The following reply was made to PR port-alpha/35448; it has been noted by GNATS.
From: "Michael L. Hitch" <mhitch@lightning.msu.montana.edu>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-alpha/35448: memory management fault trap during heavy network I/O
Date: Mon, 22 Jan 2007 14:36:25 -0700 (MST)
On Sat, 20 Jan 2007, agrier@poofygoof.com wrote:
> - the trap:
>
> CPU 0: fatal kernel trap:
>
> CPU 0 trap entry = 0x2 (memory management fault)
> CPU 0 a0 = 0xfffffe0108266000
> CPU 0 a1 = 0x1
> CPU 0 a2 = 0x0
> CPU 0 pc = 0xfffffc00007ecde0
> CPU 0 ra = 0xfffffc000035f9ac
> CPU 0 pv = 0x0
> CPU 0 curlwp = 0xfffffc000fcd2660
> CPU 0 pid = 335, comm = nfsio
...
> (gdb) list *0xfffffc00007ecde0 # pc from the trap
> 0xfffffc00007ecde0 is in in4_cksum
> (/projects/NetBSD/src/sys/netinet/in4_cksum.c:175).
A preliminary analysis seems to indicate the trap occurred where
in4_cksum is summing 16 words in an unrolled loop. If I understand the
trap registers correctly, it looks like the address causing the trap is
0xfffffe0108266000 (the contents of a0 above). Running pmap(1) against
the coredump and kernel file seems to indicate that the address is not
within the current kernel's mapped address space. The gdb backtrace
fails, so it's a little hard to figure out where it came from. I'm going
to start groveling through the stack myself to see if I can dig out the
parameters to the in4_cksum() call, and if I can follow the traceback
manually.
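For readers not familiar with that kind of loop, here is a minimal sketch
of a 16-bit ones' complement checksum that sums 16 words (32 bytes) per
unrolled pass. It is not the code in in4_cksum.c: the function name
cksum_unrolled and the WORD macro are made up for illustration, it works
on a flat buffer rather than an mbuf chain, and it reads words in network
byte order for simplicity. The point is only that each pass touches 32
consecutive bytes, so a bad pointer or length sends the loop straight
into whatever follows the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the i-th 16-bit word at p in network (big-endian) byte order. */
#define WORD(p, i) ((uint32_t)(((p)[2 * (i)] << 8) | (p)[2 * (i) + 1]))

/*
 * Hypothetical illustration, NOT the NetBSD in4_cksum():
 * ones' complement checksum over a flat buffer.
 */
uint16_t
cksum_unrolled(const uint8_t *p, size_t len)
{
	uint64_t sum = 0;

	/* Unrolled loop: 16 words (32 bytes) are read on every pass. */
	while (len >= 32) {
		sum += WORD(p, 0) + WORD(p, 1) + WORD(p, 2) + WORD(p, 3);
		sum += WORD(p, 4) + WORD(p, 5) + WORD(p, 6) + WORD(p, 7);
		sum += WORD(p, 8) + WORD(p, 9) + WORD(p, 10) + WORD(p, 11);
		sum += WORD(p, 12) + WORD(p, 13) + WORD(p, 14) + WORD(p, 15);
		p += 32;
		len -= 32;
	}

	/* Remaining whole words, one at a time. */
	while (len >= 2) {
		sum += WORD(p, 0);
		p += 2;
		len -= 2;
	}

	/* A trailing odd byte is treated as the high byte of a final word. */
	if (len == 1)
		sum += (uint32_t)p[0] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

With the example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) this
returns 0x220d, the complement of the 0xddf2 sum given there.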
It might be helpful if a backtrace from ddb could be obtained (although
in my recent experience with a 4.0_BETA [not 4.0_BETA2 yet] kernel, I was
unable to get a good backtrace on my own machine).
> - other misc foo
>
> ps won't grok the coredump:
>
> arwen$ ps -N netbsd.gdb -M /var/crash/netbsd.0.core
> ps: can't read proc credentials at 0xfffffc000ade3480: Undefined error: 0
There's an xps gdb script in src/sys/gdbscripts that is able to display
some process information (and could be extended to show more process
details).
--
Michael L. Hitch mhitch@montana.edu
Computer Consultant
Information Technology Center
Montana State University Bozeman, MT USA