Subject: Re: fsck problems
To: None <current-users@sun-lamp.cs.berkeley.edu>
From: Juergen Keil <jk@tools.de>
List: current-users
Date: 07/27/1994 14:37:32
In article <Pine.3.89.9407270926.M301-0100000@newdaisy.ee.und.ac.za>
barrett@daisy.ee.und.ac.za (Alan Barrett) writes:
> I have been having trouble with fsck for several weeks now. My root
> partition is in a state that the current fsck is unable to fix, and I
> don't like that. Sometimes fsck core dumps with SIGSEGV, and sometimes
> it doesn't. On the occasions that it doesn't coredump, it leaves the
> disk in an inconsistent state, which sometimes leads to kernel panics
> (something about "mangled directory"). An old fsck from -0.9 also
> SIGSEGV's.
I ran into similar problems while upgrading to current last week. The
affected filesystem still uses the old inode format. I've found two
bugs in fsck:
1. On a filesystem using the old inode format, fsck always
generates new directory entries (i.e. in lost+found) in the
new directory record format!
2. If a directory is corrupted immediately after the '..' entry (e.g. because
of 1.), fsck crashes with a segmentation violation. The cause is a
directory record with d_reclen > DIRBLKSIZ. The crash occurs in dirscan,
where the damaged directory record is copied into a local variable
of exactly DIRBLKSIZ bytes.
The record with d_reclen > DIRBLKSIZ is produced in 'fsck_readdir',
which can be called recursively via dofix->direrror->fileerror->
getpathname->...
As a result, fsck_readdir increments the '..' entry's d_reclen field
twice by the amount of bogus data that follows the '..' entry in the
directory block.
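To make the double increment concrete, here is a minimal standalone sketch of the arithmetic. It is not fsck code: struct rec, nested_scan_fix(), reclen_old_code() and reclen_patched() are hypothetical names, and nested_scan_fix() merely stands in for the case where the recursive scan triggered by dofix() has already enlarged the record.

```c
#include <stdio.h>

#define DIRBLKSIZ 512   /* assumed directory block size for this sketch */

/* Hypothetical, stripped-down directory record: only the length field. */
struct rec {
    unsigned short d_reclen;
};

/*
 * Stand-in for dofix(): assume the nested fsck_readdir() call reached via
 * getpathname has already absorbed the bogus bytes into the record.
 */
static void nested_scan_fix(struct rec *dp, int size)
{
    dp->d_reclen += size;
}

/* Old behaviour: add the bogus size again after dofix() returns. */
static unsigned reclen_old_code(unsigned short reclen, int loc)
{
    struct rec r = { reclen };
    int size = DIRBLKSIZ - (loc % DIRBLKSIZ);

    nested_scan_fix(&r, size);
    r.d_reclen += size;             /* second increment: overshoots */
    return r.d_reclen;
}

/* Patched behaviour: compute the target length first, then assign it. */
static unsigned reclen_patched(unsigned short reclen, int loc)
{
    struct rec r = { reclen };
    int size = DIRBLKSIZ - (loc % DIRBLKSIZ);
    long fixed_reclen = r.d_reclen + size;

    nested_scan_fix(&r, size);
    r.d_reclen = (unsigned short)fixed_reclen;  /* idempotent */
    return r.d_reclen;
}
```

With a 12-byte '..' record starting at offset 12 and garbage beginning at offset 24, the patched variant yields a record that ends exactly at the block boundary, while the old variant yields a length larger than DIRBLKSIZ.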
I suggest the following patch, which should fix both problems.
===================================================================
RCS file: /home/cvs.kurt/NetBSD/sbin/fsck/dir.c,v
retrieving revision 1.1.1.3
diff -c -r1.1.1.3 dir.c
*** 1.1.1.3 1994/07/14 12:28:45
--- dir.c 1994/07/21 19:11:32
***************
*** 190,202 ****
--- 190,214 ----
ndp = (struct direct *)(bp->b_un.b_buf + idesc->id_loc);
if (idesc->id_loc < blksiz && idesc->id_filesize > 0 &&
dircheck(idesc, ndp) == 0) {
+ long fixed_reclen;
+
size = DIRBLKSIZ - (idesc->id_loc % DIRBLKSIZ);
idesc->id_loc += size;
idesc->id_filesize -= size;
+ fixed_reclen = dp->d_reclen + size;
fix = dofix(idesc, "DIRECTORY CORRUPTED");
bp = getdirblk(idesc->id_blkno, blksiz);
dp = (struct direct *)(bp->b_un.b_buf + dploc);
+ #if 0
dp->d_reclen += size;
+ #else
+ /*
+ * dofix above might scan over our broken directory record
+ * and fix its size, too. The old code increments
+ * dp->d_reclen twice.
+ */
+ dp->d_reclen = fixed_reclen;
+ #endif
if (fix)
dirty(bp);
}
***************
*** 336,341 ****
--- 348,367 ----
dirp->d_reclen = newent.d_reclen;
dirp->d_namlen = newent.d_namlen;
bcopy(idesc->id_name, dirp->d_name, (size_t)dirp->d_namlen + 1);
+
+ /*
+ * 'dirscan' will eventually swap the original dirp back to the
+ * old inode format. Handle the new dirent here.
+ */
+ # if (BYTE_ORDER == LITTLE_ENDIAN)
+ if (!newinofmt) {
+ u_char tmp;
+
+ tmp = dirp->d_namlen;
+ dirp->d_namlen = dirp->d_type;
+ dirp->d_type = tmp;
+ }
+ # endif
return (ALTERED|STOP);
}
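As background for the second hunk, the sketch below shows why d_namlen and d_type need swapping on a little-endian machine. It assumes the usual layouts: the old record format stores the name length as a 16-bit field, and the new format splits those two bytes into d_type followed by d_namlen. struct dir_hdr, from_old_le() and swap_type_namlen() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Simplified new-format record header (name field omitted). */
struct dir_hdr {
    uint32_t d_ino;
    uint16_t d_reclen;
    uint8_t  d_type;    /* byte offset 6 */
    uint8_t  d_namlen;  /* byte offset 7 */
};

/*
 * How an old-format 16-bit name length, stored little-endian on disk,
 * appears when read through the new layout: the low byte lands in d_type.
 */
static struct dir_hdr from_old_le(uint16_t old_namlen)
{
    struct dir_hdr h = { 0, 0, 0, 0 };

    h.d_type   = (uint8_t)(old_namlen & 0xff);
    h.d_namlen = (uint8_t)(old_namlen >> 8);
    return h;
}

/* The same exchange the patch performs inline when !newinofmt. */
static void swap_type_namlen(struct dir_hdr *dp)
{
    uint8_t tmp = dp->d_namlen;

    dp->d_namlen = dp->d_type;
    dp->d_type = tmp;
}
```

Applying the swap once converts between the on-disk old layout and the in-memory new layout; applying it twice restores the original bytes, which is why an entry created in the new layout needs its own swap before dirscan swaps the block back.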
--
Juergen Keil jk@tools.de ...!{uunet,mcsun}!unido!tools!jk