tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: Vnode API change: add global vnode cache
On 26 Apr 2014, at 21:22, David Holland <dholland-tech%netbsd.org@localhost>
wrote:
> On Tue, Apr 15, 2014 at 04:11:58PM +0000, Taylor R Campbell wrote:
>> New diff at http://www.netbsd.org/~hannken/vnode-pass6-3.diff
>>
>> Plan to commit early wednesday ...
>>
>> I still don't think this approach is right. It makes a long-term copy
>> of logic in getnewvnode (because this vcache(9) will not be applicable
>> everywhere), there's no reason to use kpause or any new condvars when
>> we already have one inside each vnode which we'll be using anyway in
>> vget, and it still increases the overhead of every vnode using it.
>>
>> I don't object to the principle of VFS_LOAD_NODE, or VOP_LOAD, but at
>> the moment I think it will cause greater divergence between file
>> systems and maintenance headaches as a result.
>>
>> As an alternative, I propose the set of patches at
>> <https://www.NetBSD.org/~riastradh/tmp/20140415/>, to do the
>> following:
>> [...]
>
> Wading into this after the fact, I see the following points:
>
> - Duplicating the getnewvnode logic is definitely bad; we have enough
> cruft without adding new redundant but not quite equivalent code
> paths.
>
> - It seems to me that in the long run, all of this baloney should be
> hidden away inside the vfs layer; filesystems that use the vnode cache
> should only need to call vcache_get, and the only thing that should
> ever see a partially initialized vnode is VOP_LOAD and nothing outside
> the vnode cache should ever see a vnode with refcount zero.
Agreed.
> This in
> particular means that all the various forms of vget() should be hidden
> away, as should getnewvnode. It seems to me like both of these patch
> sets are steps in this direction, but we're far enough away from the
> destination that it's hard to choose between them.
>
> - A while back (one or two rounds ago) I floated an idea that had
> separate vnode key objects for the same reason they appear here: to
> make insertion atomic. Riastradh convinced me at the time that this
> wasn't necessary, so I dropped it and never got around to posting it
> on tech-kern. However, it had a characteristic that I don't see here:
> locks in the vnode keys. The idea of that was to introduce a level of
> indirection so that nothing like VI_CHANGING was needed: while loading
> a vnode, you unlock the table but leave the key locked, and that takes
> care, in an orderly way, of making sure nobody else sees it. If we're
> going to end up with vnode key objects, I think they should contain
> locks in favor of having exposed or semi-exposed state flags.
While I agree on atomic insertions being a mandatory property I
don't see the need for a per key lock. As collisions are very rare
wait and retry using hash table lock and key->vnode being NULL
or a per table cv like I did before is sufficient (at least for now).
> - There are still filesystems that do not use the vnode cache. Half
> the problem with the legacy scheme we have is that it tries to provide
> the expire half of the cache to all filesystems, even those that don't
> use or need the lookup half of the cache. This is fundamentally wrong.
> I think rather than trying to continue with this folly we should do
> something different that keeps these filesystems separate.
>
> - Question 1: How much overhead would it really cause if we didn't
> bother to cycle tmpfs vnodes "in" and "out" but just materialized them
> along with the tmpfs_node and kept them around until someone destroys
> the tmpfs_node? vnodes have a fair amount of slop in them, but they
> aren't that big and if you're using tmpfs you presumably have enough
> memory to keep whole files in RAM anyway. In this case we could just
> provide a constructor and destructor for non-cache vnodes, decouple
> that from the lifecycle logic (and firewall it off with plenty of
> assertions and such) and a lot of the silly goes away.
Once we reach a point where this decision has to be made it may be
better to use the vnode cache for tmpfs than to make tmpfs the only
fs being different.
> - Question 2: The other obvious candidates for not using the vnode
> cache are virtual constructs like kernfs and procfs. ISTM that
> "loading" and "unloading" their vnodes is equally pointless and that
> the same solution applies. The question is: are there any other fses
> that wouldn't use the vnode cache for some other reason, such that
> this reasoning wouldn't apply? I can't think of any, and it seems like
> anything where there are genuine load and unload operations for vnodes
> vnodes should be using the cache, but that doesn't prove anything.
Kernfs and procfs already use a hash table and are candidates for
the vnode cache.
> Also, it should be vcache_get(), not intern; with all "intern"
> operations I can think of, you pass in the object to be interned,
> not just a key for it. If we had vcache_intern(), the model would be
> that you consed up a vnode, then passed it (along with the key) to
> vcache_intern(), which would then either insert it into the cache and
> return it, or destroy it and return the existing equivalent one. This
> is not really what we want.
Agreed. And we may need vcache_lookup to lookup but not load a
node too.
> and FWIW, I favor VOP_LOAD over VFS_LOADVNODE.
I prefer a fs operation over a vnode operation to initialise a
vnode. Using a vnode operation we had to either pass the vnode
operations vector to vache_get() which is ugly or we had to set
it from "*mp->mnt_op->vfs_opv_descs[0]->opv_desc_vector_p" before
calling VOP_LOAD(). To me it looks better to have the fs setup the
operations vector (which may not be the default one).
Could we finally agree on this interface:
vcache_get(mp, key, key_len, vpp) to lookup and possibly load a vnode.
vcache_lookup(mp, key, key_len, vpp) to lookup a vnode.
vcache_remove(mp, key, key_len) to remove a vnode from the cache.
VFS_LOAD_NODE(mp, vp, key, key_len, new_key) to initialise a vnode.
--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig
(Germany)
Home |
Main Index |
Thread Index |
Old Index