Subject: Re: Panic in simple_lock_switchcheck
To: None <current-users@netbsd.org>
From: Sverre Froyen <sverre@viewmark.com>
List: current-users
Date: 05/25/2007 14:19:39
On Monday 21 May 2007, you wrote:
> Hi,
>
> It looks like a new locking issue has been introduced some time after 22
> March. I use bogofilter to detect spam and with recent kernels I get
> panics when marking emails as spam or ham (running bogofilter -s and
> bogofilter -n). After reboot, the bogofilter database is always corrupted.
> The database resides on an LFS file system.
I retrieved various versions of common and sys using cvs update with the -D
option. For each version, I built a GENERIC_LAPTOP + LOCKDEBUG kernel
(computer is i386, single processor). I then copied a saved known-bad
version of the bogofilter database file to its default location, rebooted,
and ran "bogofilter -n < <mail message file>". The reboot was necessary to
get consistent results. Userland is from 22 March.
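For anyone who wants to repeat this, the date search can be organized as a
bisection. The following is only a sketch: first_bad_date and the is_bad
callback are hypothetical names, and it assumes that every kernel after the
first bad date also panics. In practice is_bad would stand for the manual
steps above (cvs update -D <date>, build, restore the database, reboot, run
bogofilter -n).

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad(date) is true.

    Assumes the kernel from `good` does not panic, the kernel from `bad`
    does, and every date after the first bad one is bad as well.
    """
    while (bad - good).days > 1:
        # Test the date halfway between the known-good and known-bad ends.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid   # regression is at or before mid
        else:
            good = mid  # regression is after mid
    return bad


if __name__ == "__main__":
    # Stand-in for the manual build-and-test step; the cutoff here is
    # purely illustrative.
    cutoff = date(2007, 4, 17)
    print(first_bad_date(date(2007, 3, 22), date(2007, 5, 21),
                         lambda d: d >= cutoff))
```

Each round halves the remaining date range, so a two-month window needs
about six kernel builds rather than one per day.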
Here's what I find.
Kernels before and including 2007-04-16 do not panic.
The 2007-04-17 kernel panics with a different message and causes major file
system corruption (I had to run fsck manually on the LFS partition). I discount
this test since there was a flurry of LFS-related commits that day.
The 2007-04-18 kernel panicked initially, then, after the manual fsck, did not
panic until this morning when I had a vnlock/tstile deadlock. Now, after the
deadlock, it again panics consistently.
Kernels after and including 2007-04-19 panic consistently.
The panic messages (except for 2007-04-17) are:
switching with held simple_lock 0x... CPU 0 lfs_vnops.c: 1746
_prop_dictionary_keysym32_pool(...) at 0x...
Bad frame pointer: 0x...
Ideas anyone? I can easily get more information since this is perfectly
reproducible.
Thanks,
Sverre