Subject: Re: Some locking issues in subr_pool.c
To: Jason Thorpe <thorpej@shagadelic.org>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 04/30/2005 16:45:41
hi,
yea, I've always thought the locking in the pool code was a little whacky.
your changes look reasonable to me. as for your XXXJRTs:
+ /*
+ * Destroy all caches for this pool.
+ * XXXJRT We don't lock here, because the locking order is
+ * pool_cache -> pool. Maybe it doesn't matter because the
+ * pool is being destroyed anyway. Or maybe we should panic
+ * here if the pool_caches are not already gone.
+ */
I'd say we should assert that there are no pool_caches referring to this pool.
+ /*
+ * XXXJRT Could improve this by calling
+ * pool_do_put() and freeing the entire
+ * page list at the end.
+ */
yea, that would be better.
-Chuck
On Fri, Apr 29, 2005 at 11:59:17AM -0700, Jason Thorpe wrote:
> Folks...
>
> I noticed some locking issues in subr_pool.c the other day:
>
> - Locking protocol for pr_rmpage() was hideous. Sometimes called
> with the pool locked, sometimes not. And in the "not" case, it was
> modifying fields normally protected by the lock.
>
> - Fixing the above requires always deferring freeing the pool page to
> the back-end allocator, because the pool must not be locked when the
> pool_allocator is called (lock ordering is pool_allocator -> pool;
> see pool_allocator_free()). This was a mostly mechanical change.
>
> - When doing the above mechanical change, I noticed that pool_reclaim
> () violates the pool_cache -> pool lock ordering. I didn't see a
> good way to avoid that problem, so pool_cache_reclaim() has to use a
> trylock, and bail out if it that fails. This is not a huge deal,
> since we already trylock the pool at the beginning of pool_reclaim(),
> and this is only an opportunistic memory reclamation anyway.
>
> The following patch should fix these problems. I have not even tried
> to compile it yet, but I would like to get comments on it.
>
> -- thorpej
>