Subject: Re: bin/23725: possible quotacheck enhancement
To: None <kre@munnari.OZ.AU>
From: Greg A. Woods <woods@weird.com>
List: netbsd-bugs
Date: 12/17/2003 17:37:23
[ On Friday, December 12, 2003 at 19:11:53 (+0700), kre@munnari.OZ.AU wrote: ]
> Subject: bin/23725: possible quotacheck enhancement
>
>
> quotacheck can run very slowly (take a long time, an extremely
> long time) if run on a filesystem that happens to have a file
> allocated to a uid (or gid if using group quotas) that is large.
... unless there is a uid (or gid if using group quotas) with a value of
ULONG_MAX, in which case it'll just run forever.... :-)
> I believe it has one bug - bad things are likely if uid 2^32-1
> actually owns files (or exists in the password file). (same for
> groups) (assuming long is 32 bits
Yes, it still goes into a loop. I tried fixing this but I couldn't quite
get the logic right. If I managed to get the loop to exit then I ended
up with a zero-byte quota file somehow. Instead I just added the
following hard-nosed checks to the beginning of addid(). Note that on my
systems I have both UID_MAX and GID_MAX defined as (~(uid_t)0).
	u_long maxid;

	switch (type) {
	case GRPQUOTA:
		maxid = GID_MAX;
		break;
	default:
	case USRQUOTA:
		maxid = UID_MAX;
		break;
	}
	if (id > maxid)		/* only possible if sizeof(u_long) > 4 */
		errx(1, "encountered impossible %s ID value: %lu",
		    qfextension[type], id);
	if (id > (maxid - 1))	/* -1 makes us loopy! */
		errx(1, "encountered invalid %s ID value: %lu",
		    qfextension[type], id);
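A minimal sketch of why an ID of ULONG_MAX is special, assuming the ID loop has the common shape "for (id = 0; id <= highid; id++)" (the real loop is not quoted in this message, so that shape is an assumption). With highid == ULONG_MAX the condition can never become false, because id++ wraps from ULONG_MAX back to 0. The helper name visit_ids() is hypothetical and only illustrates a loop form that tests before incrementing:

```c
#include <limits.h>

/*
 * Hypothetical helper, not taken from quotacheck: visit every id in
 * [lowid, highid] and return how many were visited.
 *
 * The exit test happens *before* the increment, so id is never
 * incremented past highid and cannot wrap around to 0, even when
 * highid == ULONG_MAX.
 */
static unsigned long
visit_ids(unsigned long lowid, unsigned long highid)
{
	unsigned long id = lowid;
	unsigned long visited = 0;

	if (lowid > highid)
		return 0;		/* empty range */
	for (;;) {
		visited++;		/* stand-in for the per-id work */
		if (id == highid)	/* test first ... */
			break;
		id++;			/* ... so id < highid here, no wrap */
	}
	return visited;
}
```

Calling visit_ids(ULONG_MAX - 3, ULONG_MAX) terminates after visiting four ids, whereas the assumed "id <= highid" form would spin forever on the same range. Whether restructuring the real loop this way also avoids the zero-byte quota file mentioned above is untested.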
> If someone decides to test this, please let me know how you get on.
Other than not allowing a UID/GID of ULONG_MAX, it seems to work on the
test i386 machine under netbsd-1-6 (patches applied by hand, as UFS2
support and other -current frobbing have munged things beyond patch's
ability to do it automatically).
(I think repquota could also use a similar fix, though it does
eventually finish running.)
The new '-q' flag does speed things up significantly (I still have a
"-2" user and group):
# time quotacheck -v -u /mfbd
*** Checking user quotas for /dev/rraid1a (/mfbd)
86.20s real 21.48s user 27.90s system
# time quotacheck -v -q -u /mfbd
*** Checking user quotas for /dev/rraid1a (/mfbd)
32.26s real 1.04s user 1.71s system
There is another bug somewhere, perhaps in the checkfstab() subroutine
borrowed from fsck, or in the way it's used. It seems that when I have two
filesystems with quotas enabled, one of them gets checked twice:
# fgrep quota /etc/fstab
/dev/raid0a /build ffs rw,nodev,nosuid,softdep,userquota 1 2
/dev/raid1a /mfbd ffs rw,nodev,nosuid,softdep,userquota 1 2
# quotacheck -v -q -a
*** Checking user quotas for /dev/rraid0a (/build)
*** Checking user quotas for /dev/rraid1a (/mfbd)
*** Checking user quotas for /dev/rraid1a (/mfbd)
/mfbd: root fixed: inodes 28 -> 3 blocks 5784 -> 192
/mfbd: woods fixed: inodes 251693 -> 364 blocks 33448720 -> 10442488
/mfbd: root fixed: inodes 3 -> 28 blocks 192 -> 5784
/mfbd: woods fixed: inodes 364 -> 251693 blocks 10442488 -> 33448720
#
If I remove the quota files and run it again it still seems to do the
second FS twice, but of course multiprocessing randomness results in
different ordering to the operations:
# quotaoff -a
# rm /*/quota.user
# quotacheck -v -q -a
*** Checking user quotas for /dev/rraid1a (/mfbd)
*** Checking user quotas for /dev/rraid0a (/build)
quotacheck: creating quota file /build/quota.user
/build: root fixed: inodes 0 -> 25 blocks 0 -> 5592
/build: woods fixed: inodes 0 -> 251329 blocks 0 -> 23006232
*** Checking user quotas for /dev/rraid1a (/mfbd)
quotacheck: creating quota file /mfbd/quota.user
/mfbd: root fixed: inodes 0 -> 3 blocks 0 -> 192
/mfbd: woods fixed: inodes 0 -> 364 blocks 0 -> 10442488
/mfbd: root fixed: inodes 3 -> 27 blocks 192 -> 5624
/mfbd: woods fixed: inodes 364 -> 251693 blocks 10442488 -> 33448720
#
The result is a bogus quota file. The first values (e.g. 364 inodes,
10442488 blocks for my uid) are correct.
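One observation that can be read straight off the transcripts above: the bogus /mfbd figures equal the sum of the /build and /mfbd counts (for example 364 + 251329 = 251693 inodes and 10442488 + 23006232 = 33448720 blocks for woods). The sketch below only checks that arithmetic; the struct and function names are hypothetical, and the inference that the repeated pass accumulates both filesystems' totals is a guess, not something confirmed here. Note that root's final figures in the second run (27 inodes, 5624 blocks) do not match the sum exactly.

```c
#include <stdbool.h>

/* Hypothetical record of one "fixed:" line from the transcripts. */
struct usage {
	unsigned long inodes;
	unsigned long blocks;
};

/* Values copied from the quotacheck output quoted in this message. */
static const struct usage build_woods = { 251329UL, 23006232UL };
static const struct usage mfbd_woods  = { 364UL, 10442488UL };
static const struct usage bogus_woods = { 251693UL, 33448720UL };

static const struct usage build_root = { 25UL, 5592UL };
static const struct usage mfbd_root  = { 3UL, 192UL };
static const struct usage bogus_root = { 28UL, 5784UL };

/* True if "reported" is exactly the field-wise sum of a and b. */
static bool
is_sum_of(struct usage reported, struct usage a, struct usage b)
{
	return reported.inodes == a.inodes + b.inodes &&
	    reported.blocks == a.blocks + b.blocks;
}
```

Both is_sum_of(bogus_woods, build_woods, mfbd_woods) and is_sum_of(bogus_root, build_root, mfbd_root) hold for the numbers shown.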
Little more is revealed to my eyes by the '-d' flag:
# quotacheck -d -v -q -a
pass 1, name /dev/rraid0a
pass 1, name /dev/rraid1a
pass 2, name /dev/rraid0a
pass 2, name /dev/rraid1a
disk /dev/rraid0: /dev/rraid0a
disk /dev/rraid1: /dev/rraid1a
*** Checking user quotas for /dev/rraid0a (/build)
*** Checking user quotas for /dev/rraid1a (/mfbd)
*** Checking user quotas for /dev/rraid1a (/mfbd)
done ffs: /dev/rraid1a (/mfbd) = 0x0
/mfbd: root fixed: inodes 3 -> 28 blocks 192 -> 5784
/mfbd: woods fixed: inodes 364 -> 251693 blocks 10442488 -> 33448720
done ffs: /dev/rraid1a (/mfbd) = 0x0
done ffs: /dev/rraid0a (/build) = 0x0
At one point I also ended up with a quota recorded for a nonexistent
user (not listed in the passwd db) who doesn't seem to actually own any
files, at least none that can be found:
# repquota -v -a
*** Report for user quotas on /build (/dev/raid0a)
                     Block limits                File limits
User            used    soft    hard  grace    used  soft  hard  grace
root      --    2796       0       0             25     0     0
woods     --11503116       0       0         251329     0     0
*** Report for user quotas on /mfbd (/dev/raid1a)
                     Block limits                File limits
User            used    soft    hard  grace    used  soft  hard  grace
root      --    2892       0       0             28     0     0
woods     --16724360       0       0         251693     0     0
1001      --      48       0       0              3     0     0
# find /mfbd -user 1001 -print
#
This may also have been caused by the dueling quotacheck processes,
though. Turning off quotas, removing the quota.user files, and
re-running quotacheck "fixed" it (even with '-q'), provided I did one
filesystem at a time by hand.
I haven't noticed any problems with "fsck -p" running more than once on
any filesystem so perhaps this is only a problem in how quotacheck uses
checkfstab()? I've not yet tried fiddling with the fs_passno field to
see if making it different on each entry will help or not.
--
Greg A. Woods
+1 416 218-0098 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>