Subject: Re: kern/29652
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: David Laight <david@l8s.co.uk>
List: netbsd-bugs
Date: 08/20/2005 13:29:01
The following reply was made to PR kern/29652; it has been noted by GNATS.
From: David Laight <david@l8s.co.uk>
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/29652
Date: Sat, 20 Aug 2005 14:27:48 +0100
On Thu, Aug 18, 2005 at 05:29:10PM +0900, YAMAMOTO Takashi wrote:
> > panic: kernel diagnostic assertion "p->p_nrlwps == 0" failed: file "/usr/src/sys/kern/kern_exit.c", line 781
>
> as p_nrlwps currently has no locking afaik, the panic is not surprising. :)
>
> proc.h claims it's protected by p_lock, but i think
> sched_lock is more straightforward lock to use.
Last time I looked (and I've not seen anything that might affect it) the
locking of the parent/child hierarchy (and the lwp one) wasn't correct.
I thought about correcting it - and may have made a few changes - but
it is basically a 'lost cause'. The killer activity is the 're-parenting'
that goes on when a process is run under gdb.
Even a process exiting when the parent has set SA_NOCLDWAIT causes grief.
Possibly fixable with a carefully constructed lock hierarchy.
(Linux leaves a zombie lurking until the parent could take the signal)
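A rough user-space sketch of what a lock hierarchy for the parent/child
links could look like. This is pthreads, not kernel code; struct proc_sketch,
lock_pair() and link_child() are made-up names, and ordering by address is
just one possible rule, assumed here for illustration:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a process entry; NOT NetBSD's struct proc. */
struct proc_sketch {
    pthread_mutex_t p_lock;       /* per-process lock */
    struct proc_sketch *parent;   /* written with child and parent locked */
    int nchildren;                /* protected by this entry's own p_lock */
};

static void proc_sketch_init(struct proc_sketch *p)
{
    pthread_mutex_init(&p->p_lock, NULL);
    p->parent = NULL;
    p->nchildren = 0;
}

/* Take both locks in one fixed order (lower address first), so two threads
 * locking the same pair cannot deadlock against each other. */
static void lock_pair(struct proc_sketch *a, struct proc_sketch *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct proc_sketch *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(&a->p_lock);
    if (a != b)
        pthread_mutex_lock(&b->p_lock);
}

static void unlock_pair(struct proc_sketch *a, struct proc_sketch *b)
{
    pthread_mutex_unlock(&a->p_lock);
    if (a != b)
        pthread_mutex_unlock(&b->p_lock);
}

/* Attach a parentless child to parent while holding both locks. */
static void link_child(struct proc_sketch *parent, struct proc_sketch *child)
{
    lock_pair(parent, child);
    child->parent = parent;
    parent->nchildren++;
    unlock_pair(parent, child);
}
```

Re-parenting would presumably need the child, the old parent and the new
parent all locked in that order, with the old parent found through a pointer
that is itself being changed, which is where a scheme like this gets awkward.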
Using a single lock (instead of the per-process p_lock) would make it
possible to avoid lock hierarchy violations, at only a small cost on
systems with many CPUs (where contention for the lock might be a problem,
or the lock itself become a 'memory hotspot') - neither of which is
likely to be a problem until after the BIG LOCK is dead and buried.
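For illustration only, a minimal user-space sketch of the single-lock idea,
again with pthreads rather than kernel primitives. proc_tree_lock,
struct proc_node and reparent() are hypothetical names, not anything in the
NetBSD tree:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a process entry; NOT NetBSD's struct proc. */
struct proc_node {
    struct proc_node *parent;   /* protected by proc_tree_lock */
    int nchildren;              /* protected by proc_tree_lock */
};

/* One lock for every parent/child link in the system. */
static pthread_mutex_t proc_tree_lock = PTHREAD_MUTEX_INITIALIZER;

/* Move child under new_parent (NULL detaches it). The child, its old parent
 * and its new parent are all updated under the same lock, so there is no
 * lock order between them to get wrong. */
static void reparent(struct proc_node *child, struct proc_node *new_parent)
{
    pthread_mutex_lock(&proc_tree_lock);
    if (child->parent != NULL)
        child->parent->nchildren--;
    child->parent = new_parent;
    if (new_parent != NULL)
        new_parent->nchildren++;
    pthread_mutex_unlock(&proc_tree_lock);
}
```

The trade-off is the one described above: every update to any process's
links goes through the same mutex, so it only hurts once many CPUs are
contending for it.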
David
--
David Laight: david@l8s.co.uk