Subject: Re: tmpfs memory leak?
To: None <tech-kern@NetBSD.org>
From: David Young <dyoung@pobox.com>
List: tech-kern
Date: 10/26/2007 19:15:04
On Tue, Oct 23, 2007 at 08:47:43PM -0700, Bill Stouder-Studenmund wrote:
> On Tue, Oct 23, 2007 at 03:16:43PM -0500, David Young wrote:
> > On Mon, Oct 22, 2007 at 02:56:36PM -0500, David Young wrote:
> > > I have an 8M tmpfs filesystem whose available blocks are exhausted over
> > > time. There seems to be a memory leak, because I cannot account for more
> > > than 3M of the use with fstat(1) (for deleted files & dirs) and du(1),
> > >
> > *snip snip*
> >
> > Inactive nodes belonging to the nullfs held references to vnodes in
> > the underlying tmpfs. As long as the nullfs held the references,
> > the kernel would never call VOP_INACTIVE() and VOP_RECLAIM() on the
> > tmpfs nodes to reclaim their resources. It was possible to force the
> > kernel to VOP_RECLAIM() the inactive nullfs vnodes using 'sysctl -w
> > kern.maxvnodes=1'. The tmpfs reclaimed its nodes as the nullfs released
> > its references, and df(1) showed the resources held by tmpfs dropping
> > back to reasonable levels.
> >
> > I have attached a patch that solves the problem by calling vrecycle(9)
> > unconditionally from layer_inactive(). I ran the patch for several hours,
> > and I did not see my tmpfs balloon as it had previously. The number
> > of nodes and blocks in use has reached the steady state that I expect
> > in my application.
> >
> > Now the nullfs will no longer "cache" references to nodes in the
> > underlying filesystem, so nullfs may slow down a bit. I think that if
> > we can plug a memory leak at the cost of some performance, we should. :-)
> >
> > I would like to check this in, soon.
>
> Please don't. Well, the part where you use an explicit struct mount * is
> fine.
The explicit struct mount * looked essential to me. Here is my
interpretation; tell me if I got it wrong. The comments in layer_bypass()
say that the lower filesystem may vput() some of its vnode arguments,
so layer_bypass() needs to VREF() any vnode that it does not want to go
away when it calls down to the lower layer. By the same token, once
layer_bypass() has itself vput() a vnode, it must not extract a struct
mount * from it. I believe we only got away with this before because
nullfs was so lazy about ever releasing a lower vnode. After I patched
layer_inactive(), my test machine crashed shortly after boot when it
dereferenced a NULL struct mount *.
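To make the hazard concrete, here is a sketch. It is not the real
layer_bypass(), which dispatches through the vnodeop descriptor tables;
do_lower_vop() and use_mount() are hypothetical stand-ins for the VCALL()
on the lower vnode and for a consumer of the mount pointer such as
layer_node_create().

#include <sys/param.h>
#include <sys/mount.h>
#include <sys/vnode.h>

int do_lower_vop(struct vnode *);	/* hypothetical: the call down */
void use_mount(struct mount *);		/* hypothetical: needs the mount */

int
example_bypass(struct vnode *vp)
{
	struct mount *mp;
	int error;

	/*
	 * Save the mount pointer while we still hold a reference;
	 * the lower filesystem may vput() its vnode arguments.
	 */
	mp = vp->v_mount;

	error = do_lower_vop(vp);	/* vp may be vput() in here */

	/*
	 * From here on vp may already be reclaimed, and vp->v_mount
	 * may be NULL; that is the dereference that crashed my test
	 * machine.  Use the saved mp instead.
	 */
	use_mount(mp);

	return (error);
}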
> You have no quantification of the performance hit. Also, you are in a
> situation where memory issues are more important for you. We have no
> indication that all layer users are in this situation, and I believe a
> number aren't.
You're right, I don't care too much about it in my app. I am not
especially interested in measuring filesystem performance at this time.
I don't know what benchmarks the user community likes to see.
> Ok, I have half-done a patch to do this. I'll post what I come up w/
> tomorrow. I see how to fix this and get most of the way done. What I don't
> see is how to handle locking.
Tomorrow? :-)
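While I'm here, the core of my layer_inactive() change, in outline. This
is a sketch, not the attached diff; the exact arguments to VOP_UNLOCK()
and vrecycle(9) depend on which source tree you are reading.

#include <sys/param.h>
#include <sys/vnode.h>

int
layer_inactive(void *v)
{
	struct vop_inactive_args /* {
		struct vnode *a_vp;
		struct lwp *a_l;
	} */ *ap = v;
	struct vnode *vp = ap->a_vp;

	/* VOP_INACTIVE() must return the vnode unlocked. */
	VOP_UNLOCK(vp, 0);

	/*
	 * Unconditionally offer the layer vnode for recycling, so
	 * that VOP_RECLAIM() runs promptly and drops our reference
	 * on the lower (e.g. tmpfs) vnode, letting the lower
	 * filesystem free its resources.
	 */
	vrecycle(vp, NULL, ap->a_l);

	return (0);
}

The unconditional vrecycle() is exactly the "no caching" trade-off
described above: prompt release of the lower vnode in exchange for the
nullfs's node cache.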
Dave
--
David Young OJC Technologies
dyoung@ojctech.com Urbana, IL * (217) 278-3933 ext 24