Subject: bin/30816: dump(8) broken for larger values of blocking
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <blymn@baea.com.au>
List: netbsd-bugs
Date: 07/23/2005 14:21:01
>Number: 30816
>Category: bin
>Synopsis: large blocking factors cannot be used with dump
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 23 14:21:01 +0000 2005
>Originator: Brett Lymn (Master of the Siren)
>Release: NetBSD 3.99.6
>Organization:
Brett Lymn
>Environment:
System: NetBSD siren 3.99.6 NetBSD 3.99.6 (SIREN.ACPI.MP) #10: Sun Jul 17 19:29:12 CST 2005 toor@siren:/usr/src/sys/arch/amd64/compile/SIREN.ACPI.MP amd64
Architecture: x86_64
Machine: amd64
>Description:
According to the usage message from dump, the b option of dump(8) may
have a value between 1 and 1000. If a blocksize above about 200 is
used, dump misbehaves during pass III in one of two ways: it either
loops indefinitely or quits with a "master/slave protocol botched"
message. The larger b is, the more likely the "master/slave protocol
botched" message seems to be. Values near 256 result in a hang caused
by an infinite loop in tape.c:doslave(). For some reason p->count is
zero, which causes the first for loop in doslave() to never terminate.
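
As a rough illustration of why a zero count would hang such a loop,
here is a simplified model, not the actual tape.c source. It assumes
the loop advances both its record index and its request pointer by
p->count on each pass. The names struct req, walk_requests() and
max_iters are hypothetical; max_iters is an artificial guard added
only so the demonstration returns instead of hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one dump request entry. */
struct req {
	long dblk;	/* disk block to read; 0 would mean a header record */
	int count;	/* number of tape records this request covers */
};

/*
 * Walk the request array, stepping by p->count each pass.
 * Returns the number of passes taken, or -1 if more than max_iters
 * passes were needed, i.e. the loop stopped making progress.
 */
int
walk_requests(const struct req *p, int ntrec, int max_iters)
{
	int iters = 0;
	int trecno;

	assert(p != NULL && ntrec > 0);
	for (trecno = 0; trecno < ntrec;
	    trecno += p->count, p += p->count) {
		if (++iters > max_iters)
			return -1;	/* count == 0: trecno never advances */
	}
	return iters;
}
```

With every count non-zero the walk finishes in at most ntrec passes.
With a zero count in the entry being examined, neither trecno nor p
changes, so without the guard the loop would spin forever.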
>How-To-Repeat:
I was dumping a 40GB partition to a DLT40 tape drive using a
blocksize of 512. This resulted in dump hanging during pass III of the
dump. The machine was up multi-user, but the filesystem in question does
fsck clean (i.e. this problem is not due to attempting to back up a
corrupt fs).
>Fix:
The problem can be worked around by using a lower blocksize, at the
expense of the tape drive not streaming. A blocksize of 128 appears to
work reliably. I had a look at the code and found only one place where
the request count could be set to zero: tape.c:flushtape(), where it is
deliberately zeroed with a comment of "Sentinel" next to the statement.
This "sentinel" state does not seem to be checked anywhere in the code.
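
For illustration only, a check for a zero-count sentinel might look
like the sketch below. This is not a tested patch against tape.c; it
is a standalone model using hypothetical names (struct req,
walk_checked(), enum walk_status), and it assumes that stopping at the
sentinel is an acceptable response, which this report does not
establish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one dump request entry. */
struct req {
	long dblk;	/* disk block to read */
	int count;	/* tape records covered; 0 is treated as the sentinel */
};

enum walk_status { WALK_OK = 0, WALK_SENTINEL = 1 };

/*
 * Walk the request array, but stop as soon as a zero (or negative)
 * count is seen instead of looping on it. *done receives the number
 * of records accounted for before the walk ended.
 */
enum walk_status
walk_checked(const struct req *p, int ntrec, int *done)
{
	int trecno = 0;

	assert(p != NULL && done != NULL && ntrec > 0);
	while (trecno < ntrec) {
		if (p->count <= 0) {
			*done = trecno;
			return WALK_SENTINEL;	/* caller decides what to do */
		}
		trecno += p->count;
		p += p->count;
	}
	*done = trecno;
	return WALK_OK;
}
```

Returning a status rather than quitting inside the loop leaves the
caller free to treat the sentinel as either end-of-data or an error.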