Subject: Re: kern/32535: processes stuck on vnlock
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Bill Studenmund <wrstuden@netbsd.org>
List: netbsd-bugs
Date: 09/26/2006 01:35:02
The following reply was made to PR kern/32535; it has been noted by GNATS.
From: Bill Studenmund <wrstuden@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/32535: processes stuck on vnlock
Date: Mon, 25 Sep 2006 18:32:15 -0700
Ok, here is some more analysis on the problem.
The main issue is that vrele() can lock the vnode before calling
VOP_INACTIVE(). Since we are calling vrele() on the parent while we have
the child locked, this locking violates the vnode locking hierarchy; you
can't lock a vnode's parent while holding the vnode's lock.
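To make the ordering rule concrete, here is a minimal userland sketch of a check for it. Nothing here is kernel code: struct toy_vnode and lock_order_ok() are hypothetical names made up for illustration, and the "lock" is just a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode; not the kernel's struct vnode. */
struct toy_vnode {
    struct toy_vnode *parent;   /* NULL for the root directory */
    bool locked;                /* toy lock state, held by "us" */
};

/*
 * Return true if taking target's lock while holding `held` respects the
 * parent-before-child order, i.e. target is not an ancestor of a vnode
 * we already have locked.
 */
static bool
lock_order_ok(const struct toy_vnode *target, const struct toy_vnode *held)
{
    const struct toy_vnode *vp;

    if (held == NULL || !held->locked)
        return true;            /* nothing held, nothing to violate */
    for (vp = held->parent; vp != NULL; vp = vp->parent) {
        if (vp == target)
            return false;       /* would lock an ancestor of a held vnode */
    }
    return true;
}
```

With a locked child, lock_order_ok(parent, child) is false, which is the situation vrele() on the parent gets us into.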
I see three ways to fix this:
1) Do something along the lines of what I think Chuck was talking about,
and create a work queue to handle destroying vnodes. The trick, though, is
that VOP_INACTIVE() isn't necessarily about destroying a vnode, it's
telling the file system that the vnode is going on the free list. The main
user of this information (AFAIK) is NFS, which will zap a node if the
silly-rename code has been triggered.
We _could_ have a special worker thread to handle the VOP_INACTIVE()
calls; however, this will happen every time a vnode gets put on the free
list! Even if we added a flag so that we only did this processing if
requested (i.e. some vnodes skipped calling VOP_INACTIVE()), we would still
have this weird case where we have the free list we have now, and we also
have a "freeing" list.
My main concern is a case where a file system uses VOP_INACTIVE() as an
indication that it can release resources. I expect such a file system will
want VOP_INACTIVE() calls, and this change will result in a performance hit.
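A rough single-threaded sketch of what option 1 might look like, with the opt-in flag and the two lists. All names (struct dq_vnode, wants_inactive, dq_vrele_last(), dq_inactive_worker()) are hypothetical, and a counter stands in for the real VOP_INACTIVE() call; a real version would need locking around the lists and an actual worker thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode; not the kernel's struct vnode. */
struct dq_vnode {
    bool wants_inactive;        /* assumed opt-in flag set by the file system */
    int inactive_calls;         /* counts simulated VOP_INACTIVE() calls */
    struct dq_vnode *next;
};

struct dq_list {
    struct dq_vnode *head;
    size_t len;
};

static struct dq_list free_list;     /* the list we have today */
static struct dq_list freeing_list;  /* the extra "freeing" list */

static void
dq_push(struct dq_list *l, struct dq_vnode *vp)
{
    vp->next = l->head;
    l->head = vp;
    l->len++;
}

static struct dq_vnode *
dq_pop(struct dq_list *l)
{
    struct dq_vnode *vp = l->head;

    if (vp != NULL) {
        l->head = vp->next;
        vp->next = NULL;
        l->len--;
    }
    return vp;
}

/* Last reference dropped: never run the inactive hook in the caller. */
static void
dq_vrele_last(struct dq_vnode *vp)
{
    dq_push(vp->wants_inactive ? &freeing_list : &free_list, vp);
}

/* Worker body: runs with no vnode locks held by the original caller. */
static size_t
dq_inactive_worker(void)
{
    struct dq_vnode *vp;
    size_t n = 0;

    while ((vp = dq_pop(&freeing_list)) != NULL) {
        vp->inactive_calls++;   /* stands in for VOP_INACTIVE(vp) */
        dq_push(&free_list, vp);
        n++;
    }
    return n;
}
```

Note how a vnode that wants the call only reaches the free list after the worker has run, which is where the extra latency for such file systems would come from.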
2) We could re-work lookup() so that we don't release the directory lock
if we're locking the child and we instead call vput(). I think this is the
best option as it gets rid of the real problem. I'll look into it, but
if someone else wants to look into this, please do! I'm not sure how
quickly I can look at it.
3) We add a vrele2() call that takes the vnode on which we're calling
vrele() and another vnode that we have locked. It would process just like
vrele(), except instead of just locking the vnode, it tries to get the
lock. If it can, it proceeds. If it can't, it releases the other vnode
then blocks waiting for the first vnode's lock.
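The locking step in option 3 can be sketched in userland with pthread mutexes standing in for vnode locks. This is only an illustration of the try-then-back-off pattern under that assumption; struct fake_vnode and fake_lock_with_held() are hypothetical names, not an existing or proposed kernel interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode: only the lock matters here. */
struct fake_vnode {
    pthread_mutex_t lock;
};

/*
 * Lock `target` while the caller already holds `held`.
 *
 * Returns true if `held` stayed locked the whole time. Returns false if
 * `held` had to be released to avoid blocking with it held; in that case
 * `held` is left unlocked and the caller must revalidate whatever it
 * protected. `target` is locked on return either way.
 */
static bool
fake_lock_with_held(struct fake_vnode *target, struct fake_vnode *held)
{
    /* Fast path: nobody has target, so no ordering problem can bite. */
    if (pthread_mutex_trylock(&target->lock) == 0)
        return true;

    /* Contended: drop the held lock first, then it is safe to sleep. */
    pthread_mutex_unlock(&held->lock);
    pthread_mutex_lock(&target->lock);
    return false;
}
```

The return value matters: once the held vnode's lock has been dropped, anything the caller learned under it may be stale.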
Option 3 probably is the easiest solution. But I'm not sure what I think
of it.
Actually, if we did this as a locking primitive (vlock_with_held()), we
could use it for ".." lookup too. It's the same issue as lookup on ".."...
:-)
Take care,
Bill
----- End forwarded message -----