Subject: Re: bin/23725: possible quotacheck enhancement
To: NetBSD GNATS submissions and followups <gnats-bugs@gnats.netbsd.org>
From: Robert Elz <kre@munnari.OZ.AU>
List: netbsd-bugs
Date: 12/18/2003 20:12:25
Date: Wed, 17 Dec 2003 17:37:23 -0500 (EST)
From: "Greg A. Woods" <woods@weird.com>
Message-ID: <m1AWkIF-0003QZC@proven.weird.com>
| Yes it still goes in a loop.
I kind of knew that it would ...
| I tried fixing this but I couldn't quite get the logic right.
If it had been easy, I would have fixed that when I was poking around.
Instead I thought about it for a few minutes, concluded "that's hard"
and left it alone (for now).
I think this is going to require using something other than uids for the
loop termination (generally I hate surplus variables!)
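
As a rough sketch of what "something other than uids" could look like
(this is not the quotacheck code; visit_ids(), id_visitor and
remember_last() are names made up for illustration, and 32-bit ids are
assumed): a loop of the form "for (id = 0; id <= highid; id++)" can
never end when highid is the largest value the id type holds, because
id wraps to 0 instead of exceeding it. A wider counter avoids that:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type; not taken from quotacheck. */
typedef void (*id_visitor)(uint32_t id, void *arg);

/*
 * Visit every id in [first, highid] once and return how many were visited.
 * The loop variable is 64 bits wide, so "n <= highid" becomes false after
 * highid == UINT32_MAX instead of wrapping back to 0.
 */
uint64_t
visit_ids(uint32_t first, uint32_t highid, id_visitor fn, void *arg)
{
	uint64_t n, visited = 0;

	assert(fn != NULL);
	for (n = first; n <= highid; n++) {
		fn((uint32_t)n, arg);
		visited++;
	}
	return visited;
}

/* Example visitor: remembers the last id it was called with. */
void
remember_last(uint32_t id, void *arg)
{
	*(uint32_t *)arg = id;
}
```

Calling visit_ids(UINT32_MAX - 2, UINT32_MAX, ...) visits three ids and
stops, which the uid-typed loop variable cannot do. The cost is exactly
the surplus variable complained about above.
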
| If I managed to get the loop to exit then I ended
| up with a zero-byte quota file somehow.
That would be the ftruncate(), I expect (with highid == 0).
| (I think repquota could also use a similar fix, though it does
| eventually finish running.)
Maybe, I'll look at that one sometime.
| There is another bug somewhere, perhaps in the checkfstab() subroutine
| borrowed from fsck, or in the way it's used.
The fstab stuff (and borrowing code from fsck) was all added after I touched
this last (the lousy algorithm is/was my fault, but I at least have the
excuse that uids were just 16 bits back when this stuff got written, and
64K times around a loop isn't all that bad...)
But I will look and see if I can see what is happening there.
| /build: root fixed: inodes 0 -> 25 blocks 0 -> 5592
| /build: woods fixed: inodes 0 -> 251329 blocks 0 -> 23006232
| *** Checking user quotas for /dev/rraid1a (/mfbd)
| quotacheck: creating quota file /mfbd/quota.user
| /mfbd: root fixed: inodes 0 -> 3 blocks 0 -> 192
| /mfbd: woods fixed: inodes 0 -> 364 blocks 0 -> 10442488
| /mfbd: root fixed: inodes 3 -> 27 blocks 192 -> 5624
| /mfbd: woods fixed: inodes 364 -> 251693 blocks 10442488 -> 33448720
| #
|
| The result is a bogus quota file. The first values (e.g. 364 inodes,
| 10442488 blocks for my uid) are correct.
Did you notice that 23006232+10442488 == 33448720
and that 251329+364 == 251693?
I doubt that is a coincidence.
I kind of suspect a logic error in the parallelism code, along with
some nicely uninitialised variables (ie: presumed 0 variables).
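
To make the suspicion concrete, here is a minimal sketch of that kind of
bug, under the assumption that per-user totals live in a record that is
presumed to start at zero and is not cleared between filesystems. The
struct and function names are hypothetical and are not quotacheck's
real ones; the numbers are the ones from the log quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-user usage record; not quotacheck's real structure. */
struct usage {
	uint64_t inodes;
	uint64_t blocks;
};

/* Add one filesystem's scan result for a user to the running totals. */
void
account(struct usage *u, uint64_t inodes, uint64_t blocks)
{
	assert(u != NULL);
	u->inodes += inodes;
	u->blocks += blocks;
}

/* Clear the record; this is the step that would have to happen
 * before each filesystem is scanned. */
void
reset_usage(struct usage *u)
{
	assert(u != NULL);
	memset(u, 0, sizeof(*u));
}
```

Accounting 251329 inodes / 23006232 blocks and then 364 inodes /
10442488 blocks into the same record without a reset_usage() in between
gives 251693 inodes and 33448720 blocks, the bogus second set of values
reported for /mfbd.
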
| At one point I also ended up with a quota recorded for a non-existent
| user (not listed in the passwd db) who doesn't seem to actually own any
| files, at least none that can be found:
That one is odd - not that the files exist, that can happen, but that
it was fixed by another quotacheck. I don't suppose there was any activity
on those filesystems while all this was happening, aside from the quotachecks
of course?
Nothing in there should be attributing blocks/inodes to users that don't
own them, that's a little hard to see a rationale for.
| This may also have been caused by the dueling quotacheck processes
| though. Turning off quotas, removing the quota.user files, and
| re-running quotacheck, "fixed" it (even with '-q'), provided I do one
| filesystem at a time by hand.
Ah, "turning off": I missed that in my read-ahead before. That might have
an impact; the internal kernel usage records for currently open files
might be relevant here, perhaps.
| I haven't noticed any problems with "fsck -p" running more than once on
| any filesystem so perhaps this is only a problem in how quotacheck uses
| checkfstab()?
Probably how it uses the results from there.
| I've not yet tried fiddling with the fs_passno field to
| see if making it different on each entry will help or not.
I doubt it. I suspect a correctly positioned "exit()" in the code
might make a difference though. I'll look at it in a few days.
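
For illustration only, and purely as a guess at the shape of the
problem rather than a description of the actual code: if each
filesystem is handed to a forked child, the child has to stop once its
own job is done. A sketch, where check_one() and check_all() are
hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for checking one filesystem; returns 0 on success. */
static int
check_one(int index)
{
	(void)index;
	return 0;
}

/*
 * Fork one child per filesystem and return how many children were reaped.
 * The _exit() is the important line: without it a child would fall back
 * into the for loop and go on to process the remaining filesystems itself.
 */
int
check_all(int nfs)
{
	int i, reaped = 0;

	assert(nfs >= 0);
	for (i = 0; i < nfs; i++) {
		pid_t pid = fork();

		if (pid == -1)
			break;			/* fork failed: start no more children */
		if (pid == 0)
			_exit(check_one(i));	/* child: do one job, then stop */
	}
	while (wait(NULL) > 0)
		reaped++;
	return reaped;
}
```

With the _exit() in place, check_all(3) starts and reaps exactly three
children, each of which touches only its own filesystem.
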
kre