Subject: Re: -current amd64 panic, _kernel_unlock: assertion failed: olocks == 1
To: Nicolas Joly <njoly@pasteur.fr>
From: Andrew Doran <ad@netbsd.org>
List: current-users
Date: 02/20/2007 16:38:05
Hi,
On Tue, Feb 20, 2007 at 03:39:30PM +0100, Nicolas Joly wrote:
> For the past few days, I've been experiencing hard kernel lockups. The
> problem arises when the Symantec (formerly Veritas) NetBackup server
> tries to back up my up-to-date -current NetBSD/amd64 workstation using
> 32-bit Linux binaries (which worked perfectly for the last 5 months).
>
> When stuck, the machine does not respond to anything ... I can't even
> access the kernel debugger using the special key sequence. This is with
> a GENERIC kernel + MULTIPROCESSOR + DIAGNOSTIC + LOCKDEBUG.
>
> This morning, I experimented with all the options to isolate the
> problem and got the most useful results when removing the
> DIAGNOSTIC option.
>
> Kernel lock error: _kernel_unlock: assertion failed: olocks == 1
>
> lock address : 0xffffffff80ce4ea0 type : spin
> shared holds : 0 exclusive: 1
> shares wanted: 0 exclusive: 183
> current cpu : 1 last held: 1
> current lwp : 0xffff80004c93a900 last held: 0xffff80004c93a900
> last locked : 0xffffffff807d6801 unlocked : 0xffffffff807d682d
> curcpu holds : 2 wanted by: 000000000000000000
>
> panic: LOCKDEBUG
> Stopped in pid 360.1 (bpcd) at netbsd:breakpoint+0x5: leave
> db{1}> mach cpu 0
> using CPU 0
> db{1}> bt
> splclock() at netbsd:splclock
> _kernel_lock() at netbsd:_kernel_lock+0x1a3
> intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x23
> Xintr_ioapic_level11() at netbsd:Xintr_ioapic_level11+0xdb
> --- interrupt ---
> Xspllower() at netbsd:Xspllower+0xe
> x86_softintlock() at netbsd:x86_softintlock+0x13
> DDB lost frame for netbsd:Xsoftclock+0x1a, trying 0xffff80004b4eff20
> Xsoftclock() at netbsd:Xsoftclock+0x1a
> --- interrupt ---
> 0x246:
> db{1}> mach cpu 1
> using CPU 1
> db{1}> bt
> breakpoint() at netbsd:breakpoint+0x5
> cpu_Debugger() at netbsd:cpu_Debugger+0x9
> panic() at netbsd:panic+0x1bd
> lockdebug_lock_print() at netbsd:lockdebug_lock_print
> lockdebug_abort() at netbsd:lockdebug_abort+0x47
> _kernel_unlock() at netbsd:_kernel_unlock+0x126
> trap() at netbsd:trap+0x9f6
> --- trap (number -2134027712) ---
> 0xffff:
I think this one should be fixed now, sorry. I also fixed an issue with
LOCKDEBUG kernels, where lots of file system activity would eventually
provoke a panic.
Cheers,
Andrew