Subject: kern/10222: panic: ifree: range
To: None <gnats-bugs@gnats.netbsd.org>
From: None <Manuel.Bouyer@asim.lip6.fr>
List: netbsd-bugs
Date: 05/29/2000 00:01:12
>Number: 10222
>Category: kern
>Synopsis: panic: ifree: range
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 29 00:02:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Manuel Bouyer
>Release: -current as of May 26
>Organization:
LIP6/ASIM
>Environment:
System: NetBSD disco 1.4Y NetBSD 1.4Y (DS20-siop) #0: Fri May 26 17:04:19 MEST 2000 bouyer@disco:/home/src/sys/arch/alpha/compile/DS20-siop alpha
disco:/home/bouyer>showmount -e
Exports list on localhost:
/users/disco1 xxx.xxx.xxx.0
disco:/home/bouyer>df -ki /users/disco1
Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on
/dev/raid2e 88784755 4406538 79938979 5% 51025 6039213 0% /users/disco1
20 nfsd processes
>Description:
I tried to crash my NFS server and I succeeded. I ran the following
test scripts on 18 clients:
#! /bin/csh
while (1)
zcat /users/disco1/bouyer/gcc-2.95.2.tar.gz | tar xf -
rm -rf gcc-2.95.2
end
and
#! /bin/csh
while (1)
tar cf /dev/null .
end
That is, several machines access the same tree at once; a tar xf may be
running on one client while another is running rm -rf on the same files.
Sample output from the commands is:
tar: Cannot add file gcc-2.95.2/gcc/config/sparc/sol2-c1.asm: No such file or directory
tar: Error exit delayed from previous errors
and
tar: Cannot add file gcc-2.95.2/texinfo/util: No such file or directory
tar: Cannot add file gcc-2.95.2/install: No such file or directory
tar: Cannot add file gcc-2.95.2/gcc/config/i386/xm-cygwin.h: No such file or directory
It ran this way for about 4 hours, with an average traffic of about 3 MB/s on
the gigabit ethernet; then it panicked with:
panic: ifree: range: dev = 0x1014, ino = 1954047342, fs = /users/disco1
Stopped in nfsd at cpu_Debugger+0x4: ret zero,(ra)
db>
db> tr
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0xec
ffs_freefile() at ffs_freefile+0x74
ffs_vfree() at ffs_vfree+0x2c
ufs_inactive() at ufs_inactive+0x140
vput() at vput+0xe4
nfsrv_readdirplus() at nfsrv_readdirplus+0x11a0
nfssvc_nfsd() at nfssvc_nfsd+0x628
sys_nfssvc() at sys_nfssvc+0x6f4
syscall() at syscall+0x1d0
XentSys() at XentSys+0x50
--- syscall (155, netbsd.sys_nfssvc) ---
--- user mode ---
I haven't done much investigation, but it appears that vput is called in
nfsrv_readdirplus() when an error occurs, or when the file is gone. This looks
like a race condition.
>How-To-Repeat:
Concurrent and conflicting access to the same tree from several
clients
>Fix:
workaround: don't do that ! :)
>Release-Note:
>Audit-Trail:
>Unformatted: