Subject: port-sgimips/15140: TLB miss in kernel mode
To: None <gnats-bugs@gnats.netbsd.org>
From: None <he@netbsd.org>
List: netbsd-bugs
Date: 01/04/2002 16:58:07
>Number:         15140
>Category:       port-sgimips
>Synopsis:       TLB miss in kernel mode
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sgimips-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jan 04 07:59:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     Havard Eidnes
>Release:        NetBSD-current 20020104 (1.5ZA)
>Organization:
	UNINETT AS
>Environment:
System: NetBSD viola.urc.uninett.no 1.5ZA NetBSD 1.5ZA (VIOLA) #15: Fri Jan  4 13:29:36 CET 2002     he@viola.urc.uninett.no:/usr/src/sys/arch/sgimips/compile/VIOLA sgimips

>Description:
	While attempting to build the world after a recent kernel
	update, the machine crashed with a TLB miss in kernel mode.
	Here's the console log I captured before rebooting (yes, I
	now know I should have done "show reg" before the "trace";
	next time, maybe...):

trap: TLB miss (load or instr. fetch) in kernel mode
status=0x2, cause=0x30000008, epc=0x88003218, vaddr=0xd2e4e030
pid=18457 cmd=sh usp=0x2 ksp=0xd2e4df48
Stopped in pid 18457 (sh) at    0x88003218:     lw      a0,104(sp)
db> trace
trap: TLB miss (load or instr. fetch) in kernel mode
status=0x2, cause=0x8408, epc=0x88100518, vaddr=0xd2e4e000
pid=18457 cmd=sh usp=0x2 ksp=0xd2e4dc68
Stopped in pid 18457 (sh) at    0x88100518:     lw      v0,0(a1)
db> show reg
at          0x88150000
v0                   0
v1          0x88142404
a0          0xd2e4e000
a1          0xd2e4e000
a2               0x180
a3          0xbfa00000
t0          0x881422dc
t1                   0
t2          0x88003208
t3                   0
t4          0x881582e0
t5          0x30000200
t6          0x88003218
t7          0x88165930
s0          0x880031a8
s1                 0x3
s2          0xd2e4dfc8
s3          0x88003218
s4          0x88003208
s5                0xfe
s6                 0x1
s7                0x80
t8           0x7ffffff
t9          0x300be3f4
k0                   0
k1                   0
gp          0x881582e0
sp          0xd2e4dce8
fp          0x88003184
ra          0x88100878
sr                 0x2
mdlo             0x177
mdhi                 0
bad                  0
cs                   0
pc          0x88100518
0x88100518:     lw      v0,0(a1)
db> ps
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
>How-To-Repeat:
	Not sure if it's easily repeatable; sorry.
>Fix:
	Sorry, don't know.
>Release-Note:
>Audit-Trail:
>Unformatted:
 >18457          18379       1203          0 7  0x4086               sh
  18379          18378       1203          0 3  0x4086           nbmake    wait
  18378          18348       1203          0 3  0x4086               sh    wait
  18348          18347       1203          0 3  0x4086           nbmake    wait
  18347          12575       1203          0 3  0x4086               sh    wait
  12575          12574       1203          0 3  0x4086           nbmake    wait
  12574           1451       1203          0 3  0x4086               sh    wait
  1451            1205       1203          0 3  0x4086           nbmake    wait
  1205            1203       1203          0 3  0x4086               sh    wait
  1204             259       1204          0 3  0x4086             tail  select
  1203             259       1203          0 3    0x86              csh   pause
  1090            1068       1090       1000 3  0x4086              top  select
  1068            1066       1068       1000 3  0x4086             tcsh   pause
  1066            1063       1063       1000 3  0x4184            xterm  select
  1063            1062       1063       1000 3  0x4084             tcsh   pause
  1062             153        153          0 3   0x184             sshd  select
  448              213        448       1000 3  0x4186           systat   ttyin
  275              259        275          0 4  0x5006             more
  259              241        259          0 3  0x4086              csh   pause
  241              240        241       1000 3  0x4086             tcsh   pause
  240              235        235       1000 3  0x4184            xterm  select
  235              234        235       1000 3  0x4084             tcsh   pause
  234              153        153          0 3   0x184             sshd  select
  213              205        213       1000 3  0x4086             tcsh   pause
  205              197        197       1000 3  0x4184            xterm  select
  197              195        197       1000 3  0x4084             tcsh   pause
  195              153        153          0 3   0x184             sshd  select
  165                1          1          0 3  0x4084            getty nanosle
  164                1          1          0 3  0x4084            getty nanosle
  163                1        163          0 3  0x4086            getty   ttyin
  161                1        161          0 3    0x84             cron nanosle
  158                1        158          0 3    0x84            inetd   pause
  153                1        153          0 3    0x84             sshd  select
  133                1        133          0 3    0x84             ntpd   pause
  91                 1         91          0 3    0x84        mount_mfs  mfsidl
  71                 1         71          0 2    0x84          syslogd
  7                  0          0          0 3 0x20204         aiodoned aiodone
  6                  0          0          0 3 0x20204          ioflush  syncer
  5                  0          0          0 3 0x20204           reaper  reaper
  4                  0          0          0 3 0x20204       pagedaemon pgdaemo
  3                  0          0          0 3 0x20204          wdsc1:0  sccomp
  2                  0          0          0 3 0x20204          wdsc0:0  sccomp
  1                  0          1          0 3  0x4084             init    wait
  0                 -1          0          0 3 0x20204          swapper schedul
  18458          18457       1203          0 5  0x6002           nbmake
 db> t/t 0t18457
 pid 18457 at 0xd2e4c000
 cpu_switch+0 (0,0,0,0) ra 8802a5e8 sz 0
 8802a378+270 (0,0,0,0) ra d2e4e030 sz 48
 PC 0xd2e4e030: not in kernel space
 0+d2e4e030 (0,0,0,0) ra 0 sz 0
 User-level: pid 18457
 db> t/t 0t18458
 pid 18458 not found
 db> 
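
For reference when reading the two trap messages above, here is a minimal sketch that splits the "cause" values into the fields of the MIPS R4000 Cause register (BD in bit 31, CE in bits 29:28, IP in bits 15:8, ExcCode in bits 6:2). The function and table names are made up for this illustration and are not taken from the NetBSD sources. It only restates what the trap message already says in words: both traps carry ExcCode 2 (TLBL, a TLB miss on load or instruction fetch).

```python
# Hypothetical helper (not from the NetBSD tree): split a MIPS R4000
# Cause register value into its architectural fields.
EXC_NAMES = {
    0: "Int",    # interrupt
    1: "Mod",    # TLB modification
    2: "TLBL",   # TLB miss, load or instruction fetch
    3: "TLBS",   # TLB miss, store
    4: "AdEL",   # address error, load or instruction fetch
    5: "AdES",   # address error, store
}

def decode_cause(cause):
    """Return the BD, CE, IP and ExcCode fields of a Cause value."""
    exc_code = (cause >> 2) & 0x1f
    return {
        "bd": (cause >> 31) & 0x1,      # exception in branch delay slot
        "ce": (cause >> 28) & 0x3,      # coprocessor number (only meaningful for CpU)
        "ip": (cause >> 8) & 0xff,      # pending interrupt bits
        "exc_code": exc_code,
        "exc_name": EXC_NAMES.get(exc_code, "other"),
    }

# The two cause values printed in the console log above.
first = decode_cause(0x30000008)
second = decode_cause(0x8408)
print(first)
print(second)
```

The second value also has interrupt-pending bits set (IP = 0x84), which does not change the exception code.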
 
 	Disassembly of the failing instruction shows its location:
 
 viola: {3} gdb -q /netbsd
 (gdb) x/i 0x88003218
 0x88003218 <mips3_KernIntr+148>:        lw      $a0,104($sp)
 (gdb)
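
As a rough cross-check of the addresses in the log, the sketch below compares the faulting addresses with the range that would belong to the process's u-area. It rests on two assumptions that the report does not confirm: that the address printed by "t/t" (0xd2e4c000) is the base of the u-area, and that the u-area is two 4 KB pages (USPACE = 8192). All names in the code are hypothetical. Under those assumptions both faulting addresses lie at or just past the end of that range, and the stack pointer implied by "lw a0,104(sp)" works out to 0xd2e4dfc8. This is an illustration of the arithmetic, not a diagnosis.

```python
# Assumptions (not confirmed by the report): 4 KB pages and a
# two-page u-area starting at the address printed by "t/t".
PAGE_SIZE = 4096
UPAGES = 2
USPACE = UPAGES * PAGE_SIZE

U_BASE = 0xd2e4c000          # "pid 18457 at 0xd2e4c000"

def in_uarea(base, addr):
    """True if addr falls inside [base, base + USPACE)."""
    return base <= addr < base + USPACE

# Values copied from the console log.
first_vaddr = 0xd2e4e030     # first trap, lw a0,104(sp)
second_vaddr = 0xd2e4e000    # second trap, lw v0,0(a1)
ksp = 0xd2e4df48             # ksp from the first trap message

# lw a0,104(sp) loads from sp + 104, so sp = vaddr - 104.
implied_sp = first_vaddr - 104

for name, addr in [("ksp", ksp), ("implied sp", implied_sp),
                   ("first vaddr", first_vaddr),
                   ("second vaddr", second_vaddr)]:
    print(f"{name:12s} {addr:#010x} in u-area: {in_uarea(U_BASE, addr)}")
```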
 
 	This appears to have happened sometime while "make obj" was
 	doing its thing.
 
 	My dmesg is currently:
 
 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.
 
 NetBSD 1.5ZA (VIOLA) #15: Fri Jan  4 13:29:36 CET 2002
     he@viola.urc.uninett.no:/usr/src/sys/arch/sgimips/compile/VIOLA
 256 MB memory, 234 MB free, 768 KB for ARCS, 13208 KB in 3302 buffers
 mainbus0 (root): SGI-IP22 [SGI, 690a6af3], 1 processor
 cpu0 at mainbus0: MIPS R4400 CPU (0x460) Rev. 6.0 with MIPS R4010 FPC Rev. 0.0
 cpu0: 16KB/16B direct-mapped L1 Instruction cache, 48 TLB entries
 cpu0: 16KB/16B direct-mapped write-back L1 Data cache
 cpu0: 2048KB/128B direct-mapped write-back L2 Unified cache
 imc0 at mainbus0 addr 0x1fa00000
 imc0: Revision 3
 gio0 at imc0
 hpc0 at gio0 addr 0x1fb80000: SGI HPC3
 zsc0 at hpc0 offset 0x59830
 zstty0 at zsc0 channel 1 (console i/o)
 zstty1 at zsc0 channel 0
 sq0 at hpc0 offset 0x54000: SGI Seeq 80c03
 sq0: Ethernet address 08:00:69:0a:6a:f3
 wdsc0 at hpc0 offset 0x44000: WD33C93B SCSI, rev=0, target 7
 scsibus0 at wdsc0: 8 targets, 8 luns per target
 wdsc1 at hpc0 offset 0x4c000: WD33C93B SCSI, rev=0, target 7
 scsibus1 at wdsc1: 8 targets, 8 luns per target
 dsclock0 at hpc0 offset 0x60000
 biomask 07 netmask 07 ttymask 0f clockmask bf
 scsibus0: waiting 5 seconds for devices to settle...
 sd0 at scsibus0 target 1 lun 0: <SEAGATE, ST15150N, 8607> SCSI2 0/direct fixed
 sd0: 4095 MB, 3712 cyl, 21 head, 107 sec, 512 bytes/sect x 8388315 sectors
 sd0: sync (200.0ns offset 12), 8-bit (5.000MB/s) transfers
 sd1 at scsibus0 target 2 lun 0: <SEAGATE, ST318404LC, 0006> SCSI3 0/direct fixed
 sd1: 17501 MB, 14384 cyl, 6 head, 415 sec, 512 bytes/sect x 35843670 sectors
 sd1: sync (200.0ns offset 12), 8-bit (5.000MB/s) transfers, tagged queueing
 scsibus1: waiting 5 seconds for devices to settle...
 boot device: <unknown>
 root device: sd1
 dump device (default sd1b): 
 file system (default generic): 
 root on sd1a dumps on sd1b
 mountroot: trying cd9660...
 mountroot: trying nfs...
 mountroot: trying ffs...
 readclock: 2002/1/4/13/24/15
 root file system type: ffs
 init: copying out path `/sbin/init' 11
 setclock: 2002/1/4/14/2/2