Subject: Re: port-alpha/35448: memory management fault trap during heavy
To: None <port-alpha-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: netbsd-bugs
Date: 01/22/2007 21:40:02
The following reply was made to PR port-alpha/35448; it has been noted by GNATS.

From: "Michael L. Hitch" <mhitch@lightning.msu.montana.edu>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-alpha/35448: memory management fault trap during heavy
 network I/O
Date: Mon, 22 Jan 2007 14:36:25 -0700 (MST)

 On Sat, 20 Jan 2007, agrier@poofygoof.com wrote:
 
 > - the trap:
 >
 > CPU 0: fatal kernel trap:
 >
 > CPU 0    trap entry = 0x2 (memory management fault)
 > CPU 0    a0         = 0xfffffe0108266000
 > CPU 0    a1         = 0x1
 > CPU 0    a2         = 0x0
 > CPU 0    pc         = 0xfffffc00007ecde0
 > CPU 0    ra         = 0xfffffc000035f9ac
 > CPU 0    pv         = 0x0
 > CPU 0    curlwp    = 0xfffffc000fcd2660
 > CPU 0        pid = 335, comm = nfsio
 ...
 > (gdb) list *0xfffffc00007ecde0 # pc from the trap
 > 0xfffffc00007ecde0 is in in4_cksum
 > (/projects/NetBSD/src/sys/netinet/in4_cksum.c:175).
 
    A preliminary analysis suggests the trap occurred where in4_cksum()
 is summing 16 words in an unrolled loop.  If I understand the trap
 registers correctly, the address causing the trap is 0xfffffe0108266000
 (the contents of a0 above).  Running pmap(1) against the coredump and
 kernel file seems to indicate that this address is not within the
 current kernel's mapped address space.  The gdb backtrace fails, so it's
 a little hard to figure out where the address came from.  I'm going to
 start groveling through the stack myself to see if I can dig out the
 parameters to the in4_cksum() call and follow the traceback manually.
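 
 To illustrate the kind of loop involved, here is a minimal sketch of a
 16-way unrolled ones-complement sum.  It is NOT the code from
 in4_cksum.c; the names cksum_sketch(), is_page_aligned() and
 SKETCH_PAGE_SIZE are hypothetical, and the 8 KB page size is the usual
 Alpha value, assumed here rather than taken from the report.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8 KB pages, the usual Alpha page size. */
#define SKETCH_PAGE_SIZE 8192u

/* Hypothetical helper: is the virtual address on a page boundary? */
static int is_page_aligned(uint64_t va)
{
    return (va & (uint64_t)(SKETCH_PAGE_SIZE - 1)) == 0;
}

/*
 * Hypothetical sketch of an Internet checksum over 16-bit words.
 * The main loop consumes 16 words per iteration, so a length that is
 * too large walks the pointer linearly past the end of the buffer.
 */
static uint16_t cksum_sketch(const uint16_t *w, size_t nwords)
{
    uint64_t sum = 0;

    /* Unrolled main loop: 16 words per pass. */
    while (nwords >= 16) {
        sum += w[0];  sum += w[1];  sum += w[2];  sum += w[3];
        sum += w[4];  sum += w[5];  sum += w[6];  sum += w[7];
        sum += w[8];  sum += w[9];  sum += w[10]; sum += w[11];
        sum += w[12]; sum += w[13]; sum += w[14]; sum += w[15];
        w += 16;
        nwords -= 16;
    }

    /* Remaining words, one at a time. */
    while (nwords > 0) {
        sum += *w++;
        nwords--;
    }

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

 Under the 8 KB assumption, is_page_aligned(0xfffffe0108266000) is true.
 That would be consistent with a loop like this running off the end of
 a mapped page onto an unmapped one, although it does not prove it; the
 length and pointer passed to in4_cksum() are still needed to confirm.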
 
    It might be helpful if a backtrace from ddb could be obtained (although
 in my recent experience, a 4.0_BETA [not 4.0_BETA2 yet] kernel was unable
 to produce a good backtrace on my own machine).
 
 > - other misc foo
 >
 > ps won't grok the coredump:
 >
 > arwen$ ps -N netbsd.gdb -M /var/crash/netbsd.0.core
 > ps: can't read proc credentials at 0xfffffc000ade3480: Undefined error: 0
 
    There's an xps gdb script in src/sys/gdbscripts that is able to display 
 some process information (and could be extended to show more process
 details).
 
 --
 Michael L. Hitch			mhitch@montana.edu
 Computer Consultant
 Information Technology Center
 Montana State University	Bozeman, MT	USA