Subject: kern/34110: NFS client locks system if UDP is blocked
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: jmmv@netbsd.org
List: netbsd-bugs
Date: 07/29/2006 09:05:01
>Number: 34110
>Category: kern
>Synopsis: NFS client locks system if UDP is blocked
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 29 09:05:00 +0000 2006
>Originator: Julio M. Merino Vidal
>Release: NetBSD 3.99.23
>Organization:
>Environment:
System: NetBSD dawn.home.network 3.99.23 NetBSD 3.99.23 (GENERIC) #22: Fri Jul 28 14:56:33 CEST 2006 root@max.home.network:/var/obj/usr/src-current/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
I have a 3.0_STABLE machine serving multiple directories over NFS.
This machine is using pf(4) to filter incoming connections and it
blocks NFS UDP; i.e. it only allows NFS TCP. In order to achieve
this, it has the following rules:
pass in on $iface inet proto udp to port rpcbind keep state
pass in on $iface inet proto tcp to port rpcbind keep state
pass in on $iface inet proto udp to port 1010:1024 keep state
pass in on $iface inet proto tcp to port 1010:1024 keep state
pass in on $iface inet proto tcp to port nfs keep state
(The 1010:1024 port range is a big hack to let RPC in, but does the
trick just fine.)
Then, on a NetBSD 3.99.23 client (the system shown under Environment),
I ran the following:
# cd /media
# mount max.home.network:/home/jmmv jmmv
This command stalls because the mount is attempted over UDP, which
the server blocks. I can interrupt it with CTRL+C, and the process
does disappear from the process table, or at least it seems so from
top(1) and ps(1) output. Furthermore, mount(8) does not show any
changes to the file system table.
Unfortunately, at this point the system is already in an inconsistent
state. If I run 'ls' on /media, the command stalls and cannot be
killed. Running further commands on /media or /media/jmmv can make
the situation worse and lock up the whole system (this only happened
once, though). By lock up I mean that the VFS does not respond to any
queries, so I can neither execute new commands nor log in as any
user.
The only way out of this situation is to reboot the machine. But it
cannot be cleanly rebooted because the kernel hangs during the process
at the 'unmounting file systems...' message.
Mounting those NFS shares using TCP works perfectly fine.
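For reference, a TCP mount of the same export can be requested along
these lines. This is only a sketch: it assumes the -T flag of
mount_nfs(8) selects TCP transport, and it reuses the host name and
paths from the description above rather than quoting the exact
command that was run.

```shell
# Sketch, run as root on the client.
# Assumption: mount_nfs(8) accepts -T to use TCP instead of UDP.
cd /media
mount_nfs -T max.home.network:/home/jmmv jmmv
```

With the pf(4) rules listed earlier, this request goes through because
TCP to the nfs port is allowed, while the UDP equivalent is dropped.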
>How-To-Repeat:
See above.
>Fix:
Unknown. Could it be that a lock is not properly released?
>Unformatted: