tech-kern archive
Re: Replace lockmgr for vnodes
On Thu, Feb 28, 2008 at 10:44:14PM -0800, Jason Thorpe wrote:
> On Feb 28, 2008, at 10:15 PM, David Holland wrote:
> >I am still not convinced of this, for two reasons: first, I don't
> >think layers are going to work unless locks are exported and shared;
> >and second, if every fs does its own locking, it gives every fs the
> >opportunity to do it wrong, and there'll furthermore tend to be a lot
> >of cut&paste code with all the attendant problems.
>
> The reason file systems can get it so wrong is because of the nutty
> rules that we currently have.
Yes and no. What you say is perfectly true, and the mess that
currently exists should not be allowed to continue.
That said, by "get it wrong" I mean things like ufs_rename(), or
worse, msdosfs_rename(), rename being particularly difficult to get
right - things where the per-fs code gets locks in the wrong order, or
gets the wrong locks, or doesn't bother getting them at all, or drops
things on the floor when it's halfway done, or whatever other creative
lossage someone manages to invent.
If the locking is provided in fs-independent code, then it only needs
to be debugged once.
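
To make "debugged once" concrete, here is a minimal userland sketch of a
shared helper that always takes two locks in the same order. Everything
in it is an assumption for illustration: struct xvnode, xv_lock_pair()
and xv_unlock_pair() are hypothetical names, pthread mutexes stand in
for kernel locks, and ordering by address is just one possible policy.
It is nowhere near a full rename, which involves more vnodes and has to
respect the directory hierarchy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a vnode; not the real struct vnode. */
struct xvnode {
	pthread_mutex_t xv_lock;
};

/*
 * Lock two vnodes in one fixed global order (lowest address first), so
 * two threads locking the same pair cannot deadlock against each other.
 * Returns the vnode that was locked first.
 */
static struct xvnode *
xv_lock_pair(struct xvnode *a, struct xvnode *b)
{
	assert(a != NULL && b != NULL);

	if (a == b) {
		/* Same vnode passed twice: take its lock only once. */
		pthread_mutex_lock(&a->xv_lock);
		return a;
	}

	struct xvnode *first = ((uintptr_t)a < (uintptr_t)b) ? a : b;
	struct xvnode *second = (first == a) ? b : a;

	pthread_mutex_lock(&first->xv_lock);
	pthread_mutex_lock(&second->xv_lock);
	return first;
}

/* Release what xv_lock_pair() acquired; release order does not matter. */
static void
xv_unlock_pair(struct xvnode *a, struct xvnode *b)
{
	assert(a != NULL && b != NULL);

	pthread_mutex_unlock(&a->xv_lock);
	if (a != b)
		pthread_mutex_unlock(&b->xv_lock);
}
```

If every fs calls one such helper instead of open-coding the two lock
calls, a wrong-order bug can exist in one place rather than in each
fs's rename.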
> If vnodes are never locked when descending into a vnode op, and vnodes
> are never locked when returning from a vnode op, and vnodes are
> manipulated by accessors / mutators from within vnode ops, then there
> is no need to have a vnode locking protocol at all as part of the
> VFS<->file system interface. It simply becomes the responsibility of
> underlying file systems to lock their own data structures as necessary
> and appropriate.
Sure, except for layers. I am not convinced this will work for layers,
although I'm having a hard time coming up with a good counterexample
so far. But it certainly won't work for a layer that wants to be able
to do multiple operations on the underlying fs and make them look
atomic.
(It will not even really work for non-layered fses, because it assumes
that each vnode op is self-contained and there's never a need to group
multiple vnode calls into a single atomic operation. With a sane
implementation of O_CREAT/O_EXCL this is not true.)
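
A rough userland sketch of the O_CREAT/O_EXCL point, under stated
assumptions: struct xdir and the xdir_*() functions are hypothetical, a
pthread mutex stands in for the directory's vnode lock, and a small
array stands in for the directory contents. Exclusive create is "look
up the name, and create it only if it is absent". That is only atomic
if one lock is held across both steps. If lookup and create each took
and dropped the lock on their own, another thread could create the same
name in between.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define XDIR_MAX 8

/* Hypothetical directory: one lock plus a tiny table of names. */
struct xdir {
	pthread_mutex_t xd_lock;
	const char *xd_names[XDIR_MAX];
	int xd_count;
};

/* Caller must hold xd_lock. Returns true if the name exists. */
static bool
xdir_lookup_locked(struct xdir *d, const char *name)
{
	assert(d != NULL && name != NULL);
	for (int i = 0; i < d->xd_count; i++) {
		if (strcmp(d->xd_names[i], name) == 0)
			return true;
	}
	return false;
}

/* Caller must hold xd_lock. Adds the name without checking for dups. */
static int
xdir_create_locked(struct xdir *d, const char *name)
{
	assert(d != NULL && name != NULL);
	if (d->xd_count == XDIR_MAX)
		return ENOSPC;
	d->xd_names[d->xd_count++] = name;
	return 0;
}

/*
 * Exclusive create: lookup and create grouped under a single lock hold,
 * so no other thread can insert the name between the two steps.
 */
static int
xdir_create_excl(struct xdir *d, const char *name)
{
	int error;

	pthread_mutex_lock(&d->xd_lock);
	if (xdir_lookup_locked(d, name))
		error = EEXIST;
	else
		error = xdir_create_locked(d, name);
	pthread_mutex_unlock(&d->xd_lock);
	return error;
}
```

The grouping in xdir_create_excl() is only possible because the lock is
visible to the code that issues both steps, which is the part that goes
away if each vnode op locks and unlocks privately.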
--
David A. Holland
dholland%netbsd.org@localhost