Subject: bin/6607: dump est could be better
To: None <gnats-bugs@gnats.netbsd.org>
From: None <bgrayson@ece.utexas.edu>
List: netbsd-bugs
Date: 12/17/1998 22:09:45
>Number: 6607
>Category: bin
>Synopsis: dump estimate of blocks could be better when doing subset of fs
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people (Utility Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Dec 17 20:20:01 1998
>Last-Modified:
>Originator: Brian Grayson
>Organization:
Parallel and Distributed Systems
Electrical and Computer Engineering
The University of Texas at Austin
>Release: Dec 17, 1998
>Environment:
<Use :.!uname -a to embed this>
>Description:
The estimating code in dump, when dumping only a subset
of a filesystem, counts most directories more than once.
In most cases the effect is small, but it is possible to
construct a directory tree for which the estimate is off
by an arbitrary factor (a 10x overestimate is shown
below).
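To make the over-counting concrete, here is a minimal stand-alone sketch
(not dump code) that walks a tree with fts and counts how often the
top-level directory's own inode is handed back. The helper name
count_inode_visits() is made up for illustration, and the sketch assumes a
BSD-style <fts.h> in which "." and ".." entries returned under FTS_SEEDOT
carry valid stat data.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <fts.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of dump: walk the tree rooted at `root`
 * with the given fts options and return how many pre-order entries
 * refer to the root directory's own inode.  Returns -1 on error.
 */
long
count_inode_visits(const char *root, int fts_options)
{
	char *dirv[2];
	struct stat rootsb;
	FTS *dirh;
	FTSENT *entry;
	long visits = 0;

	assert(root != NULL);
	if (stat(root, &rootsb) == -1)
		return -1;

	dirv[0] = (char *)root;		/* fts_open() takes char * const * */
	dirv[1] = NULL;
	if ((dirh = fts_open(dirv, fts_options, NULL)) == NULL)
		return -1;

	while ((entry = fts_read(dirh)) != NULL) {
		switch (entry->fts_info) {
		case FTS_DNR:		/* errors: no usable stat data */
		case FTS_ERR:
		case FTS_NS:
		case FTS_DP:		/* post-order visit, already seen */
			continue;
		}
		if (entry->fts_statp->st_dev == rootsb.st_dev &&
		    entry->fts_statp->st_ino == rootsb.st_ino)
			visits++;
	}
	fts_close(dirh);
	return visits;
}
```

For a directory holding N subdirectories, calling this with
FTS_PHYSICAL|FTS_XDEV should return 1, while adding FTS_SEEDOT should
return N + 2: the directory itself, its own ".", and one ".." per
subdirectory. If the estimator adds the directory's size on every such
visit, the overestimate grows with N.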
>How-To-Repeat:
# mkdir /tmp/testdir
# mkdir /tmp/testdir/d{0,1,2,3,4,5,6,7,8,9}{0,1,2,3,4,5,6,7,8,9}{0,1,2,3,4,5,6,7,8,9}
# /sbin/dump 0f /dev/null /tmp/testdir
DUMP: Dumping sub files/directories from /
...
DUMP: estimated 21059 tape blocks on 0.54 tape(s).
...
DUMP: 2037 tape blocks on 1 volume
# dump.new_a 0f /dev/null /tmp/testdir
...
DUMP: estimated 2039 tape blocks on 0.05 tape(s).
...
DUMP: 2037 tape blocks on 1 volume
(Patches A, B, and C all produce an estimate of 2039 blocks,
and the dump ends up writing 2037. However, I am not sure
which patch is best. Patch A is the safest, but I'm not sure
we even want to be looking at the FTS_DOT entries that
FTS_SEEDOT enables in the first place.)
>Fix:
Patch A: in mapfileino, if we've already called mapfileino on
this inode, then return immediately.
--- traverse.c.dist Thu Dec 17 21:38:54 1998
+++ traverse.c Thu Dec 17 22:00:51 1998
@@ -149,6 +149,9 @@
 	int mode;
 	struct dinode *dp;
 
+	/* If we've already looked at this inode, then
+	 * short-circuit and return. */
+	if (TSTINO(ino, usedinomap)) return;
 	dp = getino(ino);
 	if ((mode = (dp->di_mode & IFMT)) == 0)
 		return;
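As a rough illustration of what Patch A buys, the following is a
simplified in-memory model, not the real mapfileino(). The bitmap and
macros are stand-ins modelled on dump's usedinomap and TSTINO/SETINO; the
names seenmap, reset_seenmap(), and estimate_inode(), and the `dedup`
switch, are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MAXINO	4096		/* arbitrary limit for this sketch */

/* Stand-in for dump's usedinomap: one bit per inode, inodes start at 1. */
static unsigned char seenmap[MAXINO / CHAR_BIT];

#define MAPBYTE(ino)	(seenmap[((ino) - 1) / CHAR_BIT])
#define MAPBIT(ino)	(1u << (((ino) - 1) % CHAR_BIT))

/* Clear the bitmap so a fresh estimate can be computed. */
void
reset_seenmap(void)
{
	memset(seenmap, 0, sizeof(seenmap));
}

/*
 * Hypothetical, simplified model of the estimator: add `blocks` to
 * *tapesize for inode `ino`.  With `dedup` set, an inode that has been
 * seen before is skipped, which is the check Patch A adds.
 */
void
estimate_inode(unsigned int ino, long blocks, int dedup, long *tapesize)
{
	assert(ino >= 1 && ino <= MAXINO);
	if (dedup && (MAPBYTE(ino) & MAPBIT(ino)))
		return;			/* already counted */
	MAPBYTE(ino) |= MAPBIT(ino);
	*tapesize += blocks;
}
```

Replaying the same sequence of visits with dedup set to 0 and then to 1
shows the difference: without the check every repeat visit adds its
blocks again, with it each inode contributes once. The check costs one
bitmap test per call and leaves the traversal itself untouched, which is
why I consider it the least invasive of the three.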
Patch B: don't pass the FTS_SEEDOT option to the fts_open() call:
--- traverse.c.dist Thu Dec 17 21:38:54 1998
+++ traverse.c Thu Dec 17 22:01:25 1998
@@ -193,7 +193,7 @@
 		msg("Can't determine cwd: %s\n", strerror(errno));
 		dumpabort(0);
 	}
-	if ((dirh = fts_open(dirv, FTS_PHYSICAL|FTS_SEEDOT|FTS_XDEV,
+	if ((dirh = fts_open(dirv, FTS_PHYSICAL|FTS_XDEV,
 	    NULL)) == NULL) {
 		msg("fts_open failed: %s\n", strerror(errno));
 		dumpabort(0);
Patch C: when doing the traversal, just ignore the . and ..
entries from FTS that weren't specified in the fts_open() call:
--- traverse.c.dist Thu Dec 17 21:38:54 1998
+++ traverse.c Thu Dec 17 22:02:59 1998
@@ -205,6 +205,7 @@
 		case FTS_NS:
 			msg("Can't fts_read %s: %s\n", entry->fts_path,
 			    strerror(errno));
+		case FTS_DOT: /* Skip it. */
 		case FTS_DP: /* already seen dir */
 			continue;
 		}
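Patches B and C should hand the same set of entries to mapfileino(). The
sketch below is one way to check that outside of dump; it is not dump
code. The helper count_handed_over() and its two flags are hypothetical,
and it again assumes a BSD-style <fts.h>.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <fts.h>
#include <stddef.h>

/*
 * Hypothetical helper, not dump code: count the entries that the
 * traversal loop would hand to mapfileino().
 *   use_seedot: pass FTS_SEEDOT to fts_open() (0 models Patch B)
 *   skip_dot:   skip FTS_DOT entries in the loop (1 models Patch C)
 * Returns -1 if fts_open() fails.
 */
long
count_handed_over(const char *root, int use_seedot, int skip_dot)
{
	char *dirv[2];
	FTS *dirh;
	FTSENT *entry;
	int options = FTS_PHYSICAL | FTS_XDEV;
	long handed_over = 0;

	assert(root != NULL);
	if (use_seedot)
		options |= FTS_SEEDOT;
	dirv[0] = (char *)root;		/* fts_open() takes char * const * */
	dirv[1] = NULL;
	if ((dirh = fts_open(dirv, options, NULL)) == NULL)
		return -1;

	while ((entry = fts_read(dirh)) != NULL) {
		switch (entry->fts_info) {
		case FTS_DNR:		/* errors and post-order visits */
		case FTS_ERR:
		case FTS_NS:
		case FTS_DP:
			continue;
		case FTS_DOT:		/* "." or ".." seen via FTS_SEEDOT */
			if (skip_dot)
				continue;
			break;
		}
		handed_over++;
	}
	fts_close(dirh);
	return handed_over;
}
```

On a tree of D directories and no other files, (use_seedot=0) and
(use_seedot=1, skip_dot=1) should both return D, while (use_seedot=1,
skip_dot=0) should return 3 * D, since every directory contributes an
extra "." and "..". Patch B avoids generating the extra entries at all;
Patch C generates them and then discards them.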
>Audit-Trail:
>Unformatted: