Subject: Re: experiments after recent LFS changes
To: Konrad Schroder <perseant@hhhh.org>
From: Daniel Carosone <dan@geek.com.au>
List: current-users
Date: 05/03/2005 09:29:29
--E9SPMlsjsjqOlA3h
Content-Type: multipart/mixed; boundary="B7FF+pnjRCrp0ISR"
Content-Disposition: inline
--B7FF+pnjRCrp0ISR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On Mon, May 02, 2005 at 02:10:49PM +1000, Daniel Carosone wrote:
> moreover I can't yet reproduce *any* other problem with LFS.
I knew I'd be speaking too soon.. :)
> It's getting pretty slow for some of the writing processes at
> points, especially with both cleaners going the disk is doing a
> *lot* of reading, but it is Still Going!
Actually, it was more than slow. These processes eventually all got
stuck in lfs_rese(rve?); they could be killed easily enough, which
then allowed other processes (like an ls) waiting on vnlock to
proceed. However, any new writers quickly got stuck in the same place
- even after a full reboot and fsck_lfs. Something in this filesystem
is now damaged.
Mounting the filesystem -n, and manually starting a cleaner with -d
shows an endless loop; the messages aren't entirely clear and there
are several worrying basic-sanity type checks failing, but if I read
them correctly one of the segments (#411) seems to be marked for
coalescing, the cleaner starts a new cleaner child to do this, and
then both loop endlessly. =20
A sort | uniq -c of these messages is attached (for brevity), as is a
dumplfs -s 411. The longer form of these messages (with original
ordering) available on request, as is any other output or even the
full 10gb vnd image if needed.
If I separately start just a coalescing cleaner, it also seems to just
loop endlessly emitting the free() warning. Both fsck_lfs and dump_lfs
think the filesystem is fine, but the cleaner is unhappy with it. =20
I wondered whether running the two cleaners at once was a bad idea; I
ran a separate quick test overnight with just the default cleaner (and
to raw disk, not vnd) and a couple of postmarks. That seemed to reach
the same point (postmark processes hung in lfs_rese), but when I ^C'd
the first one, the box instantly locked up solid.
--
Dan.
--B7FF+pnjRCrp0ISR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dumplfs.out"
Content-Transfer-Encoding: quoted-printable
Master Superblock at 0x8:
magic 0x70162 version 0x2 size 10485760 =20
ssize 1048576 dsize 9961392 bsize 8192 =20
fsize 1024 frag 8 minfree 10 =20
inopb 8 ifpb 409 nindir 2048 =20
nseg 10239 sepb 341 cleansz 1 =20
segtabsz 31 segmask 0x0 segshift 0 =20
bmask 0x1fff bshift 13 ffmask 0x3FF =20
ffshift 10 fbmask 0x7 fbshift 3
sushift 0 fsbtodb 1 cksum 0x5e05 =20
nclean 2 dmeta 134349 minfreeseg 511 =20
roll_id 0x760e0584 interleave 0 sumsize 1024 =20
seg0addr 0 maxfilesize 0x400801018000
Superblock disk addresses:
0x8 0xffc00 0x1ff800 0x2ff400 0x3ff000 0x4fec00 =20
0x5fe800 0x6fe400 0x7fe000 0x8fdc00 =20
Checkpoint Info
freehd 91 idaddr 0x6c2f69 ifile 1 =20
uinodes 0 bfree 1371231 avail -259946 =20
nfiles 284696 lastseg 0x6c2400 nextseg 0x6c9800 =20
curseg 0x6c2c00 offset 0x6c2f6a serial 110520 =20
tstamp Mon May 2 15:56:40 2005
IFILE inode
mode o100600 nlink 1 uid 0 gid 0 size 6766592
atime Thu Jan 1 10:00:00 1970
mtime Thu Jan 1 10:00:00 1970
ctime Thu Jan 1 10:00:00 1970
inum 1
Direct Addresses
0x6c2eed 0x4a3f3a 0x13def2 0x40db5a 0x63b372 0x40cdea
0x480029 0x480031 0x480039 0x63b37a 0x4a3e99 0x69a732
0x6c2f61 0x0 0x0
IFILE contents
free_head 91
free_tail 122603
clean 2 dirty 10237
bfree 1371253 avail -259820
SEGMENT 411 (Disk Address 0x66c00)
Segment Summary Info at 0x66c00
next 0x67000 nfinfo 1 ninos 0 flags --
sumsum 0x360 datasum 0x2ad0 create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4276
Inode addresses:
FINFO for inode: 122536 version 1 nblocks 6 lastlength 8192
4610 4611 4612 4613 4614 4615
Segment Summary Info at 0x66c31
next 0x67000 nfinfo 1 ninos 0 flags --
sumsum 0x35ce datasum 0x1c70 create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4277
Inode addresses:
FINFO for inode: 122536 version 1 nblocks 8 lastlength 8192
4616 4617 4618 4619 4620 4621 4622 4623
Segment Summary Info at 0x66c72
next 0x67000 nfinfo 1 ninos 0 flags --
sumsum 0x7859 datasum 0x51e4 create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4278
Inode addresses:
FINFO for inode: 122536 version 1 nblocks 8 lastlength 8192
4624 4625 4626 4627 4628 4629 4630 4631
Segment Summary Info at 0x66cb3
next 0x67000 nfinfo 5 ninos 4 flags DC
sumsum 0x71be datasum 0x2f1b create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4279
Inode addresses: 0x66d0c {122536v1, 121959v1, 121968v1, 122005v1}
FINFO for inode: 122536 version 1 nblocks 11 lastlength 8192
4632 4633 4634 4635 4636 4637 4638 4639
-4108 -2060 -2061
FINFO for inode: 121959 version 1 nblocks 1 lastlength 2048
0
FINFO for inode: 121968 version 1 nblocks 1 lastlength 1024
0
FINFO for inode: 122005 version 1 nblocks 1 lastlength 1024
0
FINFO for inode: 122524 version 1 nblocks 0 lastlength 0
Segment Summary Info at 0x66d11
next 0x67000 nfinfo 1 ninos 0 flags --
sumsum 0x230b datasum 0xaa4 create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4280
Inode addresses:
FINFO for inode: 122524 version 1 nblocks 32 lastlength 8192
4064 4065 4066 4067 4068 4069 4070 4071
4072 4073 4074 4075 4076 4077 4078 4079
4080 4081 4082 4083 4084 4085 4086 4087
4088 4089 4090 4091 4092 4093 4094 4095
Segment Summary Info at 0x66e12
next 0x67000 nfinfo 1 ninos 0 flags --
sumsum 0xd7bd datasum 0xee32 create Mon May 2 12:40:52 2005
roll_id 760e0584 serial 4281
Inode addresses:
FINFO for inode: 122524 version 1 nblocks 61 lastlength 8192
4096 4097 4098 4099 4100 4101 4102 4103
4104 4105 4106 4107 4108 4109 4110 4111
4112 4113 4114 4115 4116 4117 4118 4119
4120 4121 4122 4123 4124 4125 4126 4127
4128 4129 4130 4131 4132 4133 4134 4135
4136 4137 4138 4139 4140 4141 4142 4143
4144 4145 4146 4147 4148 4149 4150 4151
4152 4153 4154 4155 4156
--B7FF+pnjRCrp0ISR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="lfsclean.lose.uniq"
2 blocks_written 0
1 seg_error 1
1 seg_error 29
2 segs_cleaned 0
2 segs_empty 0
2 util_sos 0.00
30 (2 of 1850 segments available, avail = -259820, bfree = 1371253)
29 clean_segment: 0 inodes 0 indirect -> 8192 bytes + 0 = 8192 total (to save 0)
507 lfs_cleanerd in free(): warning: junk pointer, too high to make sense.
1 lfs_cleanerd[1305]: avg util: 0.00 std dev: 0.000000
1 lfs_cleanerd[1305]: util_tot 0.00
29 lfs_cleanerd[1305]: Bad magic number: 0x0 instead of 0x61561
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:48 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:49 2005
2 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:50 2005
2 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:51 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:54 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:55 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:56 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:57 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:58 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:55:59 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:00 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:01 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:02 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:03 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:04 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:05 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:07 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:08 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:09 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:10 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:11 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:13 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:14 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:15 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:16 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:18 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:19 2005
1 lfs_cleanerd[1305]: Cleaner Running at Mon May 2 15:56:20 2005
1 lfs_cleanerd[1305]: Cleaner starting on filesystem /mnt
29 lfs_cleanerd[1305]: Cleaning segment 411 (of 1 choices)
29 lfs_cleanerd[1305]: Current usage: maxrss=0, idrss=0, isrss=0
29 lfs_cleanerd[1305]: Warning: invalid segment summary at 0x66d09
29 lfs_cleanerd[1305]: add_segment failed segment 411: Undefined error: 0
29 lfs_cleanerd[1305]: add_segment: lfs_segmapv failed for segment 411
29 lfs_cleanerd[1305]: adding segment 411: contains 1024 bytes
28 lfs_cleanerd[1305]: coalescing already in progress
1 lfs_cleanerd[1305]: lfs_cleanerd: blocks_read 0
1 lfs_cleanerd[1469]: avg util: 0.00 std dev: 0.000000
1 lfs_cleanerd[1469]: util_tot 0.00
1 lfs_cleanerd[1469]: ino 1 markv 821 blocks
1 lfs_cleanerd[1469]: ino 1 total discontinuity 127 (12402845) for 826 blocks
1 lfs_cleanerd[1469]: lfs_cleanerd: blocks_read 0
1 lfs_cleanerd[1469]: new coalescing process, pid 1469
--B7FF+pnjRCrp0ISR--
--E9SPMlsjsjqOlA3h
Content-Type: application/pgp-signature
Content-Disposition: inline
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (NetBSD)
iD8DBQFCdrfYEAVxvV4N66cRAvUWAJwPAbFLSYAeONHoQlDnV3qwRoPHigCeL1du
JfzXfYL4fQcskuVggKA0E+A=
=if6U
-----END PGP SIGNATURE-----
--E9SPMlsjsjqOlA3h--