NetBSD-Bugs archive
Re: kern/46224: fatal page fault, kernfs_readdir()
The following reply was made to PR kern/46224; it has been noted by GNATS.
From: Greg Oster <oster%cs.usask.ca@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc:
Subject: Re: kern/46224: fatal page fault, kernfs_readdir()
Date: Mon, 19 Mar 2012 08:53:57 -0600
On Mon, 19 Mar 2012 02:30:01 +0000 (UTC)
Petar Bogdanovic <petar%smokva.net@localhost> wrote:
> >Number: 46224
> >Category: kern
> >Synopsis: fatal page fault, kernfs_readdir()
> >Confidential: no
> >Severity: critical
> >Priority: medium
> >Responsible: kern-bug-people
> >State: open
> >Class: sw-bug
> >Submitter-Id: net
> >Arrival-Date: Mon Mar 19 02:30:01 +0000 2012
> >Originator: Petar Bogdanovic
> >Release: NetBSD 6.0_BETA (16.03.2012)
> >Organization:
> >Environment:
> amd64
> >Description:
> a pretty recent netbsd-6 kernel (date: 16.03., arch: amd64)
> just crashed several times. The bug seems reproducible and does not
> appear, when no kernfs is involved:
>
> $ mount
> /dev/raid0a on / type ffs (log, NFS exported, local)
> kernfs on /kern type kernfs (local)
>
> $ sudo find / -name '*,v'
> /etc/mtree/special.local,v
> (...many more lines...)
> /var/backups/boot.cfg.current,v
> uvm_fault(0xfffffe8114c4dbd0, 0x0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff804f4ceb cs 8 rflags 10297
> cr2 0 cpl 0 rsp fffffe80016077a0
> kernel: page fault trap, code=0
> Stopped in pid 847.1 (find) at netbsd:kernfs_readdir+0x687: movq 7fb0b30e(%rip),%rdi
> db{1}> bt
> kernfs_readdir() at netbsd:kernfs_readdir+0x687
> VOP_READDIR() at netbsd:VOP_READDIR+0x65
> vn_readdir() at netbsd:vn_readdir+0xf6
> sys___getdents30() at netbsd:sys___getdents30+0x76
> syscall() at netbsd:syscall+0xc4
>
>
> The same situation yields a slightly different result when
> ddb.onpanic=0 and ends with what seems to be a complete
> meltdown after the core was successfully dumped:
>
> uvm_fault(0xfffffe811556ad40, 0x0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff804f4ceb cs 8 rflags 10297
> cr2 0 cpl 0 rsp fffffe80015b77a0
> panic: trap
> cpu1: Begin traceback...
> printf_nolog() at netbsd:printf_nolog
> startlwp() at netbsd:startlwp
> alltraps() at netbsd:alltraps+0xa2
> VOP_READDIR() at netbsd:VOP_READDIR+0x65
> vn_readdir() at netbsd:vn_readdir+0xf6
> sys___getdents30() at netbsd:sys___getdents30+0x76
> syscall() at netbsd:syscall+0xc4
> cpu1: End traceback...
>
> (..dump begins, finishes..)
>
> pmap_kenter_pa: mapping already present
> pmap_kenter_pa: mapping already present
> pmap_kenter_pa: mapping already present
>
> (..many, many more identical lines..)
> (..takes as long as the core dump..)
>
> pmap_kenter_pa: mapping already present
> pmap_kenter_pa: mapping already present
> pmap_kenter_pa: mapping already present
> succeeded
>
>
> Skipping crash dump on recursive panic
> panic: wdc_exec_command: polled command not done
> cpu1: Begin traceback...
> printf_nolog() at netbsd:printf_nolog
> wdccommand() at netbsd:wdccommand
> wd_flushcache() at netbsd:wd_flushcache+0xd7
> wd_shutdown() at netbsd:wd_shutdown+0x3e
> pmf_system_shutdown() at netbsd:pmf_system_shutdown+0x81
> cpu_reboot() at netbsd:cpu_reboot+0x2c
> vpanic() at netbsd:vpanic+0x1dd
> printf_nolog() at netbsd:printf_nolog
> startlwp() at netbsd:startlwp
> alltraps() at netbsd:alltraps+0xa2
> VOP_READDIR() at netbsd:VOP_READDIR+0x65
> vn_readdir() at netbsd:vn_readdir+0xf6
> sys___getdents30() at netbsd:sys___getdents30+0x76
> syscall() at netbsd:syscall+0xc4
> cpu1: End traceback...
> rebooting...
>
> >How-To-Repeat:
> find /kern -ls
> >Fix:
> none
I don't know if I'm seeing quite the same error, but I've been chasing
a similar issue the last few days... What I see is:
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80133415 cs e030 rflags 282 cr2 7f7ff7327080 cpl 0 rsp ffffa0005b72d9a0
Stopped in pid 396.1 (find) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
pool_cache_put_paddr() at netbsd:pool_cache_put_paddr+0x25
static_qc_pools() at ffffffff80661100
static_qc_pools() at ffffffff80661480
Bad frame pointer: 0xffffffff8078a7e0
ds ffff
es a14a
fs 0
gs b278
rdi 0
rsi fffffffe
rbp ffffa0005b72d9a0
rbx ffffa0005b72dad0
rdx 1000000
rcx ffffa0000456b000
rax ffffffff80d0b0c0
r8 ffffa0000456b000
r9 400
r10 2
r11 ffffa0000460308d
r12 ffffa00004603000
r13 ffffa0000739a870
r14 ffffffffffffffff
r15 ffffa00004603098
rip ffffffff80133415 breakpoint+0x5
cs e030
rflags 282
rsp ffffa0005b72d9a0
ss e02b
netbsd:breakpoint+0x5: leave
db{3}>
and I can trigger it on demand with: find -x / -name "ajsdf" -print
The kernel is a netbsd-6 XEN3_DOMU kernel on amd64, with DEBUG and
debug_freecheck turned on.
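For anyone who wants to narrow this down without find, here is a minimal sketch of the same kind of directory walk. It is only an illustration and not part of the original report: count_entries is a hypothetical helper name, and it assumes Python 3 is installed on the test machine. os.scandir reads directories through the C library's readdir, so it should end up in the same getdents/VOP_READDIR path that appears in the backtraces above, but I have not verified that it triggers either crash.

```python
import os


def count_entries(root):
    """Walk `root` iteratively and return the number of directory entries seen.

    Hypothetical helper for illustration only; every directory is listed
    once, which is the part of find's work that reads directories.
    """
    total = 0
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    total += 1
                    # Do not follow symlinks, so the walk cannot loop.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Missing or unreadable directories are skipped.
            continue
    return total
```

Calling count_entries("/kern") restricts the walk to the kernfs mount, much like the "find /kern -ls" given under How-To-Repeat; calling it on "/" walks everything. Unlike find -x, this sketch does not stop at mount points.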
Later...
Greg Oster