NetBSD-Bugs archive
kern/37718: RAIDframe regression after vmlocking2 merge
>Number: 37718
>Category: kern
>Synopsis: reconstruct-in-place no longer works after vmlocking2 merge
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jan 08 15:55:01 +0000 2008
>Originator: oster%netbsd.org@localhost
>Release: NetBSD 4.99.48
>Organization:
>Environment:
System: NetBSD 4.99.48 (RAIDFRAME.ddbLD) #6: Mon Jan 7 17:01:28 CST 2008
oster@quad:/u1/devel/current/src/sys/arch/i386/compile/RAIDFRAME.ddbLD
Architecture: i386
Machine: i386
>Description:
Attempt to do a reconstruct-in-place of a failed (or
non-failed) component. Watch the machine keel over as follows:
rizzo# raidctl -vR /dev/sd3f raid1
Reconstruction suvm_fault(0xc0ae75a0, 0, 1) -> 0xe
kernel: supervisor trap page fault, code=0
Stopped in pid 0.33 (system) at netbsd:turnstile_block+0x165: movl 0x10(%ebx),%eax
db> tr
turnstile_block(0,1,ca13a330,c0a4a744,0) at netbsd:turnstile_block+0x165
mutex_vector_enter(ca13a330,1ed,0,0,10) at netbsd:mutex_vector_enter+0xf9
vget(ca13a330,10,cb45f72c,c04c050b,cb45f71c) at netbsd:vget+0x17d
cache_lookup(ca13bbf8,cb45fa7c,cb45fa90,c13d8168,0) at netbsd:cache_lookup+0xf7
ufs_lookup(cb45f814,ca13bd90,cb45f82c,c04b2226,c07de720) at
netbsd:ufs_lookup+0xcc
VOP_LOOKUP(ca13bbf8,cb45fa7c,cb45fa90,20002,ca13bd90) at netbsd:VOP_LOOKUP+0x2d
lookup(cb45fa68,20002,400,cb45fa84,0) at netbsd:lookup+0x20b
namei(cb45fa68,0,cb45fa6c,1,0) at netbsd:namei+0x145
vn_open(cb45fa68,3,0,ca13a330,ffffffff) at netbsd:vn_open+0x71
dk_lookup(c0f7c800,cb479820,cb45fcfc,1,0) at netbsd:dk_lookup+0x5a
rf_ReconstructInPlace(c0f25000,0,c13c55e0,c13c55e0,c01dd4b0) at
netbsd:rf_ReconstructInPlace+0x18d
rf_ReconstructInPlaceThread(c13c55e0,0,c01002bd,0,c01002bd) at
netbsd:rf_ReconstructInPlaceThread+0x3d
db> show reg
ds 0x10
es 0x10
fs 0x30
gs 0x10
edi 0xcb479820
esi 0xc09c688b copyright+0x43f4b
ebp 0xcb45f68c
ebx 0xfffffff0
edx 0xc0ae8fa0 turnstile_tab+0x4c0
ecx 0xcb479820
eax 0xfffffff0
eip 0xc0469375 turnstile_block+0x165
cs 0x8
eflags 0x10287
esp 0xcb45f654
ss 0x10
netbsd:turnstile_block+0x165: movl 0x10(%ebx),%eax
db>
Boot a LOCKDEBUG kernel and attempt the same reconstruct. The
result is the following:
rizzo# raidctl -vR /dev/sd3f raid1
Reconstruction sMutex error: lockdebug_barrier: spin lock held
lock address : 0x00000000c0af5fe0 type : spin
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000cb519820 last held: 0x00000000cb519820
last locked : 0x00000000c04714a6 unlocked : 000000000000000000
initialized : 0x00000000c0471556
owner field : 0x0000000000010600 wait/spin: 0/1
panic: LOCKDEBUG
Stopped in pid 0.33 (system) at netbsd:breakpoint+0x1: ret
db> tr
breakpoint(c09d4b0b,c09d0e0b,c07e775c,c09d4b2d,c0b00400) at
netbsd:breakpoint+0x1
lockdebug_abort1(c09d4b2d,1,c0b003c0,c047bdd4,1) at netbsd:lockdebug_abort1+0x6b
lockdebug_barrier(c0af38a0,1,1,0,c047cb2e) at netbsd:lockdebug_barrier+0xdd
rw_vector_enter(c0af45e4,0,0,80000000,0) at netbsd:rw_vector_enter+0x1f3
vm_map_lock_read(c0af45e0,c0ad5b48,c0ad18a0,c0b003c0,c047bdd4) at
netbsd:vm_map_lock_read+0x21
uvm_fault_internal(c0af45e0,0,1,0,0) at netbsd:uvm_fault_internal+0xa2
trap() at netbsd:trap+0x6de
--- trap (number 6) ---
turnstile_block(0,1,ca15a330,c0a57744,4) at netbsd:turnstile_block+0x185
mutex_vector_enter(ca15a330,1ed,0,ca15bbf8,10) at
netbsd:mutex_vector_enter+0x159
vget(ca15a330,10,cb52f72c,c04c9e8b,cb52f71c) at netbsd:vget+0x17d
cache_lookup(ca15bbf8,cb52fa7c,cb52fa90,c047bdd4,5) at netbsd:cache_lookup+0xf7
ufs_lookup(cb52f814,ca15bd90,cb52f82c,c04bbb96,c07e8820) at
netbsd:ufs_lookup+0xcc
VOP_LOOKUP(ca15bbf8,cb52fa7c,cb52fa90,20002,ca15bd90) at netbsd:VOP_LOOKUP+0x2d
lookup(cb52fa68,20002,400,cb52fa84,c0ad5b48) at netbsd:lookup+0x20b
namei(cb52fa68,cb35c3c0,0,c047c8be,c0ad5b48) at netbsd:namei+0x145
vn_open(cb52fa68,3,0,c047bd6d,c0ad5b48) at netbsd:vn_open+0x71
dk_lookup(c1223800,ca14c400,cb52fcfc,1,6) at netbsd:dk_lookup+0x5a
rf_ReconstructInPlace(c0f3c000,0,c13bc660,c13bc660,c01e0fe0) at
netbsd:rf_ReconstructInPlace+0x1ef
rf_ReconstructInPlaceThread(c13bc660,0,c01002bd,0,c01002bd) at
netbsd:rf_ReconstructInPlaceThread+0x3d
db>
This problem does not exist in 4.99.47. That kernel, on the same box
under exactly the same circumstances, works just fine.
This problem is very repeatable on my test box. Additional
information available upon request.
>How-To-Repeat:
Run 'raidctl -vR /dev/sd3f raid1', where 'sd3f' is a component
of RAID set 'raid1'.
>Fix:
PLEASE! :)