Subject: port-sgimips/15140: TLB miss in kernel mode
To: None <gnats-bugs@gnats.netbsd.org>
From: None <he@netbsd.org>
List: netbsd-bugs
Date: 01/04/2002 16:58:07
>Number: 15140
>Category: port-sgimips
>Synopsis: TLB miss in kernel mode
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-sgimips-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jan 04 07:59:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: Havard Eidnes
>Release: NetBSD-current 20020104 (1.5ZA)
>Organization:
UNINETT AS
>Environment:
System: NetBSD viola.urc.uninett.no 1.5ZA NetBSD 1.5ZA (VIOLA) #15: Fri Jan 4 13:29:36 CET 2002 he@viola.urc.uninett.no:/usr/src/sys/arch/sgimips/compile/VIOLA sgimips
>Description:
During an attempt at building the world after a recent update
of the kernel, the machine crashed with a TLB miss in kernel
mode. Here's the console log I captured before rebooting
(yes, I now know I should have done "show reg" before the
"trace"; next time, maybe...):
trap: TLB miss (load or instr. fetch) in kernel mode
status=0x2, cause=0x30000008, epc=0x88003218, vaddr=0xd2e4e030
pid=18457 cmd=sh usp=0x2 ksp=0xd2e4df48
Stopped in pid 18457 (sh) at 0x88003218: lw a0,104(sp)
db> trace
trap: TLB miss (load or instr. fetch) in kernel mode
status=0x2, cause=0x8408, epc=0x88100518, vaddr=0xd2e4e000
pid=18457 cmd=sh usp=0x2 ksp=0xd2e4dc68
Stopped in pid 18457 (sh) at 0x88100518: lw v0,0(a1)
db> show reg
at 0x88150000
v0 0
v1 0x88142404
a0 0xd2e4e000
a1 0xd2e4e000
a2 0x180
a3 0xbfa00000
t0 0x881422dc
t1 0
t2 0x88003208
t3 0
t4 0x881582e0
t5 0x30000200
t6 0x88003218
t7 0x88165930
s0 0x880031a8
s1 0x3
s2 0xd2e4dfc8
s3 0x88003218
s4 0x88003208
s5 0xfe
s6 0x1
s7 0x80
t8 0x7ffffff
t9 0x300be3f4
k0 0
k1 0
gp 0x881582e0
sp 0xd2e4dce8
fp 0x88003184
ra 0x88100878
sr 0x2
mdlo 0x177
mdhi 0
bad 0
cs 0
pc 0x88100518
0x88100518: lw v0,0(a1)
db> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
>How-To-Repeat:
Not sure if it's easily repeatable; sorry.
>Fix:
Sorry, don't know.
>Release-Note:
>Audit-Trail:
>Unformatted:
>18457 18379 1203 0 7 0x4086 sh
18379 18378 1203 0 3 0x4086 nbmake wait
18378 18348 1203 0 3 0x4086 sh wait
18348 18347 1203 0 3 0x4086 nbmake wait
18347 12575 1203 0 3 0x4086 sh wait
12575 12574 1203 0 3 0x4086 nbmake wait
12574 1451 1203 0 3 0x4086 sh wait
1451 1205 1203 0 3 0x4086 nbmake wait
1205 1203 1203 0 3 0x4086 sh wait
1204 259 1204 0 3 0x4086 tail select
1203 259 1203 0 3 0x86 csh pause
1090 1068 1090 1000 3 0x4086 top select
1068 1066 1068 1000 3 0x4086 tcsh pause
1066 1063 1063 1000 3 0x4184 xterm select
1063 1062 1063 1000 3 0x4084 tcsh pause
1062 153 153 0 3 0x184 sshd select
448 213 448 1000 3 0x4186 systat ttyin
275 259 275 0 4 0x5006 more
259 241 259 0 3 0x4086 csh pause
241 240 241 1000 3 0x4086 tcsh pause
240 235 235 1000 3 0x4184 xterm select
235 234 235 1000 3 0x4084 tcsh pause
234 153 153 0 3 0x184 sshd select
213 205 213 1000 3 0x4086 tcsh pause
205 197 197 1000 3 0x4184 xterm select
197 195 197 1000 3 0x4084 tcsh pause
195 153 153 0 3 0x184 sshd select
165 1 1 0 3 0x4084 getty nanosle
164 1 1 0 3 0x4084 getty nanosle
163 1 163 0 3 0x4086 getty ttyin
161 1 161 0 3 0x84 cron nanosle
158 1 158 0 3 0x84 inetd pause
153 1 153 0 3 0x84 sshd select
133 1 133 0 3 0x84 ntpd pause
91 1 91 0 3 0x84 mount_mfs mfsidl
71 1 71 0 2 0x84 syslogd
7 0 0 0 3 0x20204 aiodoned aiodone
6 0 0 0 3 0x20204 ioflush syncer
5 0 0 0 3 0x20204 reaper reaper
4 0 0 0 3 0x20204 pagedaemon pgdaemo
3 0 0 0 3 0x20204 wdsc1:0 sccomp
2 0 0 0 3 0x20204 wdsc0:0 sccomp
1 0 1 0 3 0x4084 init wait
0 -1 0 0 3 0x20204 swapper schedul
18458 18457 1203 0 5 0x6002 nbmake
db> t/t 0t18457
pid 18457 at 0xd2e4c000
cpu_switch+0 (0,0,0,0) ra 8802a5e8 sz 0
8802a378+270 (0,0,0,0) ra d2e4e030 sz 48
PC 0xd2e4e030: not in kernel space
0+d2e4e030 (0,0,0,0) ra 0 sz 0
User-level: pid 18457
db> t/t 0t18458
pid 18458 not found
db>
Disassembling the failing instruction shows its location:
viola: {3} gdb -q /netbsd
(gdb) x/i 0x88003218
0x88003218 <mips3_KernIntr+148>: lw $a0,104($sp)
(gdb)
This appears to have happened sometime while "make obj" was
doing its thing.
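For anyone reading the two cause= values in the console log, here is a
minimal sketch of how they can be decoded. It assumes the standard
R4000-family CP0 Cause register layout (ExcCode in bits 6:2, IP in bits
15:8, CE in bits 29:28, BD in bit 31). The helper names are hypothetical
and are not taken from the NetBSD sources.

```c
#include <stdint.h>

/*
 * Field extractors for the MIPS CP0 Cause register.
 * Assumption: R4000-family layout; the function names are illustrative
 * only and do not correspond to any NetBSD kernel symbol.
 */
static inline unsigned cause_exccode(uint32_t cause)
{
	return (cause >> 2) & 0x1f;	/* exception code, bits 6:2 */
}

static inline unsigned cause_ip(uint32_t cause)
{
	return (cause >> 8) & 0xff;	/* pending interrupts, bits 15:8 */
}

static inline unsigned cause_ce(uint32_t cause)
{
	return (cause >> 28) & 0x3;	/* coprocessor number, bits 29:28 */
}

static inline unsigned cause_bd(uint32_t cause)
{
	return (cause >> 31) & 0x1;	/* set if fault was in a branch delay slot */
}

/* Map the first few exception codes to their architectural mnemonics. */
static inline const char *exccode_name(unsigned code)
{
	switch (code) {
	case 0: return "Int";	/* interrupt */
	case 1: return "Mod";	/* TLB modification */
	case 2: return "TLBL";	/* TLB miss, load or instruction fetch */
	case 3: return "TLBS";	/* TLB miss, store */
	case 4: return "AdEL";	/* address error, load or fetch */
	case 5: return "AdES";	/* address error, store */
	default: return "other";
	}
}
```

Under these assumptions, both 0x30000008 and 0x8408 decode to ExcCode 2
(TLBL), which matches the "TLB miss (load or instr. fetch)" text printed
by the trap handler. The second value also has pending-interrupt bits
0x84 set, and neither value has the BD bit set.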
My dmesg is currently:
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 1.5ZA (VIOLA) #15: Fri Jan 4 13:29:36 CET 2002
he@viola.urc.uninett.no:/usr/src/sys/arch/sgimips/compile/VIOLA
256 MB memory, 234 MB free, 768 KB for ARCS, 13208 KB in 3302 buffers
mainbus0 (root): SGI-IP22 [SGI, 690a6af3], 1 processor
cpu0 at mainbus0: MIPS R4400 CPU (0x460) Rev. 6.0 with MIPS R4010 FPC Rev. 0.0
cpu0: 16KB/16B direct-mapped L1 Instruction cache, 48 TLB entries
cpu0: 16KB/16B direct-mapped write-back L1 Data cache
cpu0: 2048KB/128B direct-mapped write-back L2 Unified cache
imc0 at mainbus0 addr 0x1fa00000
imc0: Revision 3
gio0 at imc0
hpc0 at gio0 addr 0x1fb80000: SGI HPC3
zsc0 at hpc0 offset 0x59830
zstty0 at zsc0 channel 1 (console i/o)
zstty1 at zsc0 channel 0
sq0 at hpc0 offset 0x54000: SGI Seeq 80c03
sq0: Ethernet address 08:00:69:0a:6a:f3
wdsc0 at hpc0 offset 0x44000: WD33C93B SCSI, rev=0, target 7
scsibus0 at wdsc0: 8 targets, 8 luns per target
wdsc1 at hpc0 offset 0x4c000: WD33C93B SCSI, rev=0, target 7
scsibus1 at wdsc1: 8 targets, 8 luns per target
dsclock0 at hpc0 offset 0x60000
biomask 07 netmask 07 ttymask 0f clockmask bf
scsibus0: waiting 5 seconds for devices to settle...
sd0 at scsibus0 target 1 lun 0: <SEAGATE, ST15150N, 8607> SCSI2 0/direct fixed
sd0: 4095 MB, 3712 cyl, 21 head, 107 sec, 512 bytes/sect x 8388315 sectors
sd0: sync (200.0ns offset 12), 8-bit (5.000MB/s) transfers
sd1 at scsibus0 target 2 lun 0: <SEAGATE, ST318404LC, 0006> SCSI3 0/direct fixed
sd1: 17501 MB, 14384 cyl, 6 head, 415 sec, 512 bytes/sect x 35843670 sectors
sd1: sync (200.0ns offset 12), 8-bit (5.000MB/s) transfers, tagged queueing
scsibus1: waiting 5 seconds for devices to settle...
boot device: <unknown>
root device: sd1
dump device (default sd1b):
file system (default generic):
root on sd1a dumps on sd1b
mountroot: trying cd9660...
mountroot: trying nfs...
mountroot: trying ffs...
readclock: 2002/1/4/13/24/15
root file system type: ffs
init: copying out path `/sbin/init' 11
setclock: 2002/1/4/14/2/2