Hi,

On Tuesday, Sep 22, 2009, at 11:55 PM, Bill Stouder-Studenmund wrote:
> On Tue, Sep 15, 2009 at 12:54:00AM +0200, Adam Hamsik wrote:
>> Hi,
>>
>> On Monday, Sep 14, 2009, at 5:55 AM, matthew green wrote:
>>> i'm still not entirely sure what the point of this patch is.  i
>>> understand it helps zfs, but i don't understand why or how.  i'm
>>> also curious what sort of testing you've done.  i do not believe
>>> that testing in qemu is sufficient.  how does it affect systems
>>> that recycle vnodes a lot, such as older systems running a build?
>>
>> I do not have such a system yet. What this patch does is use another
>> thread to reclaim vnodes, so that vnodes are reclaimed in a different
>> context than the one they are allocated in.
>
> You REALLY need to test such a system. If you can't test one, you
> should not commit the change. Find a system and run it hard. Something
> like a few compiles going on, with a few "find" commands running in
> the background (on file systems each with more files than you have
> max vnodes).
I'm testing this patch on my test server with 4 CPUs by building a
release with -j8, with a very low maxvnodes limit and some other
vnode-hungry tasks running in the background.
>> Current: Vnodes are allocated only if there are no vnodes on the free
>> list. If there is a free vnode on the list, it will be recycled, which
>> actually means VOP_RECLAIM is called on it. In zfs there is a problem
>> with calling getnewvnode from zfs_zget: in some cases getnewvnode
>> picks a vnode from the free list and calls VOP_RECLAIM, which can
>> lead to deadlock because VOP_RECLAIM can try to lock the same mutex
>> already held by zfs_zget. This can't be easily fixed unless we touch
>> and change the whole zfs locking protocol.
>>
>> With patch: Vnodes are only allocated and there is no vnode recycling.
>> If the number of used vnodes in the system is greater than
>> kern.vnodes_num_hiwat (in percent of maxvnodes), the vrele thread is
>> woken up and starts releasing free vnodes until the number of used
>> vnodes is below kern.vnodes_num_lowat.
>
> The problem with this is you're going to reduce caching just because
> you can't figure out how to fix the locking.
>
> Also, it strikes me that your design doesn't scale well. Right now,
> each thread that wants a vnode becomes a vnode cleaning thread. So if
> we need to reclaim a lot of vnodes at once, all the waiting threads
> pitch in to do work. Without an easy way to dynamically scale the
> number of reclaiming threads, you introduce a resource allocation
> issue. Pick too few threads, and you can choke a busy system. Pick too
> many, and you waste space usable by other subsystems.
But vnodes don't need to be reclaimed as often as they are allocated. I
think allocation is more critical than the speed of reclaim.
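
To make the design clearer, the vrele thread boils down to something
like the sketch below. This is simplified and not the literal code from
the patch; vnodes_hiwat(), vnodes_lowat() and vrele_free_one() are
illustrative helper names only.

    #include <sys/param.h>
    #include <sys/vnode.h>
    #include <sys/mutex.h>
    #include <sys/condvar.h>

    extern u_int numvnodes;     /* vnodes currently allocated */

    static kmutex_t vrele_lock; /* illustrative names */
    static kcondvar_t vrele_cv;

    u_int vnodes_hiwat(void);   /* maxvnodes * kern.vnodes_num_hiwat / 100 */
    u_int vnodes_lowat(void);   /* maxvnodes * kern.vnodes_num_lowat / 100 */
    int vrele_free_one(void);   /* VOP_RECLAIM one free vnode, 0 on success */

    /* Started once at boot (e.g. via kthread_create()). */
    static void
    vrele_thread(void *cookie)
    {

            mutex_enter(&vrele_lock);
            for (;;) {
                    /* Sleep until getnewvnode() signals us at the hiwat mark. */
                    while (numvnodes < vnodes_hiwat())
                            cv_wait(&vrele_cv, &vrele_lock);

                    /*
                     * Release free vnodes until we are back under the lowat
                     * mark.  A real implementation would drop the lock
                     * around the actual VOP_RECLAIM call.
                     */
                    while (numvnodes > vnodes_lowat() && vrele_free_one() == 0)
                            continue;
            }
    }

The idea is that getnewvnode() only allocates, bumps numvnodes and
signals the condvar when the hiwat mark is crossed, so VOP_RECLAIM
never runs in the context of the thread that is allocating the vnode.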
> My suggestion on how to fix this is:
>
> 1) Get a mutex variant that always records the lock owner (curlwp())
>    and a mutex_enter() variant that will report EDEADLK when you try
>    to lock a lock you already hold.
>
> 2) Use said mutexes in zfs, and use said mutex_enter() variant in the
>    places that VOP_RECLAIM would hit.
>
> 3) Have VOP_RECLAIM report EDEADLK if you'd get a deadlock with the
>    operation (and clean up appropriately).
>
> 4) Have getvnode() try the next vnode if it gets EDEADLK.
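
If I read the suggestion right, it boils down to something like the
sketch below. mutex_enter_deadlock() is an invented name; it assumes
the mutex records its owning LWP, which is what point 1) asks for (as
I understand it, mutex_owned() only tells you that reliably for
adaptive mutexes today).

    #include <sys/errno.h>
    #include <sys/mutex.h>

    /*
     * Rough sketch of the proposed locking scheme; the function name
     * is made up for illustration.
     */
    int
    mutex_enter_deadlock(kmutex_t *mtx)
    {

            if (mutex_owned(mtx))
                    return EDEADLK;     /* curlwp already holds this lock */
            mutex_enter(mtx);
            return 0;
    }

    /*
     * zfs would take its locks through this in the paths VOP_RECLAIM
     * can reach, zfs_reclaim() would pass EDEADLK back up through
     * VOP_RECLAIM, and the free-list scan in the allocation path would
     * skip that vnode and try the next one instead of deadlocking on
     * itself.
     */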
This seems like a hack to me.

Regards,
Adam