Subject: bin/20768: amd sometimes has problems remounting
To: None <gnats-bugs@gnats.netbsd.org>
From: Manuel Bouyer <bouyer@asim.lip6.fr>
List: netbsd-bugs
Date: 03/17/2003 12:56:58
>Number: 20768
>Category: bin
>Synopsis: amd sometimes has problems remounting
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 17 03:58:01 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:
>Release: NetBSD 1.6.1_RC2
>Organization:
LIP6, Universite Paris VI.
>Environment:
System: NetBSD armandeche 1.6.1_RC2 NetBSD 1.6.1_RC2 (ARMANDECHE) #5: Fri Mar 14 15:21:13 CET 2003 bouyer@folk:/local/folk1/bouyer/netbsd-1-6/src/sys/arch/alpha/compile/ARMANDECHE alpha
Architecture: alpha
Machine: alpha
Problem also seen on i386
>Description:
I use amd with the 'net' example map:
#cat /etc/amd.conf
[ global ]
dismount_interval = 900
[ /net ]
map_name = /etc/amd/net
#cat /etc/amd/net
# $NetBSD: net,v 1.2 1997/12/12 11:52:55 hubertf Exp $
#
# /net - NFS-mount directory by cd'ing into it: cd /net/host/filesystem;
# be sure to mkdir /net before using this.
#
/defaults type:=host;rhost:=${key};fs:=${autodir}/${rhost}/root
* host==${key};type:=link;fs:=/ \
host!=${key};opts:=rw,hard,intr,nodev,nosuid,noconn
This worked fine with NetBSD 1.6. With 1.6.1_RC2 I have started seeing the
following problem: I cause amd to mount a remote directory, keep it busy for
some time, and then release it (just doing
cd /net/some/server; sleep <a few hours>; cd /
seems enough to reproduce the problem). A few minutes later I try to
mount it again. The access from the shell fails with EIO, and
/var/log/messages shows:
Mar 17 12:34:58 armandeche amd[13622]: mountd rpc failed: RPC: Unable to receive
Mar 17 12:34:58 armandeche amd[13622]: mountd rpc failed: RPC: Unable to receive
Mar 17 12:34:58 armandeche amd[144]: Process 13622 exited with signal 13
Mar 17 12:34:58 armandeche amd[144]: mount for /net/jazz got signal 13
Waiting a bit longer usually lets the mount succeed.
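For reference, signal 13 in the log above is SIGPIPE, so the child amd
process was apparently killed while writing to a connection that had
already been closed. A minimal sketch for decoding the number from one of
the log lines (the variable names are illustrative only):

```shell
# Log line copied from this report.
line='Mar 17 12:34:58 armandeche amd[144]: Process 13622 exited with signal 13'

# Pull out the trailing signal number, then ask the shell for its name.
signo=$(printf '%s\n' "$line" | sed -n 's/.*exited with signal \([0-9][0-9]*\)$/\1/p')
signame=$(kill -l "$signo")

echo "signal $signo is SIG$signame"
```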
>How-To-Repeat:
Use the above amd maps. The failure is somewhat random, but it usually
follows a sequence like:
cd /net/some/server
sleep <a few hours>
cd /
sleep 650
cd /net/some/server
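The sequence above can be wrapped in a small script, as a sketch only.
MOUNT_DIR, BUSY_SECS, IDLE_SECS and DRY_RUN are hypothetical knobs of this
script, not amd settings, and the 4-hour busy period is an arbitrary
stand-in for "a few hours". By default it only prints the steps; set
DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the sequence from this report; all variables are illustrative.
MOUNT_DIR=${MOUNT_DIR:-/net/some/server}
BUSY_SECS=${BUSY_SECS:-14400}   # "a few hours"; 4h is an arbitrary choice
IDLE_SECS=${IDLE_SECS:-650}     # value used in the report
DRY_RUN=${DRY_RUN:-1}           # 1: only print the steps; 0: run them

step() {
    # Print the command, and execute it unless in dry-run mode.
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

repro() {
    step cd "$MOUNT_DIR" || return 1   # triggers the amd mount and keeps it busy
    step sleep "$BUSY_SECS"
    step cd /                          # release the mount
    step sleep "$IDLE_SECS"
    step cd "$MOUNT_DIR"               # remount attempt; this is where EIO shows up
}

repro
```

Since the failure is somewhat random, several runs may be needed before the
last step fails.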
>Fix:
Unknown. What does the "Unable to receive" error mean?
>Release-Note:
>Audit-Trail:
>Unformatted: