Subject: bin/4082: amd dumps core if a server is down
To: None <gnats-bugs@gnats.netbsd.org>
From: Matthieu Herrb <matthieu@laas.fr>
List: netbsd-bugs
Date: 09/04/1997 22:29:43
>Number: 4082
>Category: bin
>Synopsis: amd dumps core if a server is down
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people (Utility Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Sep 4 18:05:01 1997
>Last-Modified:
>Originator:
>Organization:
Matthieu Herrb | e-mail: matthieu@laas.fr
CNRS/LAAS | url: http://www.laas.fr/~matthieu
Toulouse, France | War, what is it good for ? Absolutely nothing !
>Release: NetBSD-current 08/20/97
>Environment:
System: NetBSD abel 1.2G NetBSD 1.2G (ABEL) #1: Wed Jul 2 13:35:44 MEST 1997 matthieu@abel:/usr/src/sys/arch/sparc/compile/ABEL sparc
>Description:
If an NFS server used by an amd map is down, amd dumps core, and
all subsequent accesses to automounted directories then hang.
>How-To-Repeat:
In the following amd map, the server pif-1 is down. The map is used on
/home. Run 'cd /home/bug' and amd dumps core.
/defaults opts:=resvport,nosuid,noconn
matthieu type:=nfs;rhost:=pif;rfs:=/users1;sublink:=matthieu
bug type:=nfs;rhost:=pif-1;rfs:=/users1;sublink:=bug
Here is some debugging information from gdb.
Note that pif-1 resolves to an IP address, although the address does
not seem to be initialized inside amd (fs->fs_ip is 0x0 in the dump):
abel# host pif-1
pif-1.laas.fr A 140.93.160.46
Program received signal SIGSEGV (11), Segmentation fault
0x9894 in prime_nfs_fhandle_cache (path=0x27906 "/users1", fs=0x26b00,
fhbuf=0xf7ffee80, wchan=0x25d00) at /usr/src/usr.sbin/amd/amd/ops_nfs.c:353
353 if (fp->fh_sin.sin_addr.s_addr != fs->fs_ip->sin_addr.s_addr) {
(gdb) p fp
$1 = (fh_cache *) 0x25e00
(gdb) p *fp
$2 = {fh_q = {q_forw = 0x25c80, q_back = 0x1d724}, fh_wchan = 0x25d00,
fh_error = -1, fh_id = 3, fh_cid = 57, fh_nfs_version = 0, fh_nfs_handle = {
v3 = {fhs_status = MNT3_OK, mountres3_u = {mountinfo = {fhandle = {
fhandle3_len = 0, fhandle3_val = 0x0}, auth_flavors = {
auth_flavors_len = 0, auth_flavors_val = 0x0}}}}, v2 = {
fhs_status = 0, fhstatus_u = {
fhs_fhandle = '\000' <repeats 31 times>}}}, fh_sin = {
sin_len = 0 '\000', sin_family = 0 '\000', sin_port = 0, sin_addr = {
s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, fh_fs = 0x0,
fh_path = 0x0}
(gdb) p fs
$3 = (fserver *) 0x26b00
(gdb) p *fs
$4 = {fs_q = {q_forw = 0x26980, q_back = 0x1d140}, fs_refc = 1,
fs_host = 0x27960 "pif-1.laas.fr", fs_ip = 0x0, fs_cid = 56, fs_pinger = 30,
fs_flags = 21, fs_type = 0x5ca0 "nfs", fs_version = 0,
fs_proto = 0x52d0 "udp", fs_private = 0x27980,
fs_prfree = 0x1c118 <_DYNAMIC+280>}
(gdb) p fs->fs_ip
$5 = (struct sockaddr_in *) 0x0
(gdb) quit
>Fix:
This patch seems to fix the problem and restore the correct behaviour.
It removes the code that frees ip and sets it to 0 when no NFS version
could be determined:
--- amd/srvr_nfs.c.orig Fri Jul 25 13:26:52 1997
+++ amd/srvr_nfs.c Thu Sep 4 22:26:09 1997
@@ -721,11 +721,6 @@
}
nfs_version = best_nfs_version;
}
-
- if (!nfs_version) {
- free((voidp)ip);
- ip = 0; /* Server probably down - no ping responce */
- }
#else /* not HAVE_FS_NFS3 */
nfs_version = NFS_VERSION;
#endif /* not HAVE_FS_NFS3 */
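
The gdb session shows that the fault at ops_nfs.c:353 is a dereference of
fs->fs_ip while it is 0x0. Independently of the patch above, a defensive
check at that comparison could look roughly like the sketch below. This is
only an illustration: the struct and function names (fserver_sketch,
fh_cache_sketch, fh_addr_matches) are hypothetical stand-ins, not amd's
real definitions, and only the field names are taken from the gdb output.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <netinet/in.h>

/* Hypothetical, cut-down stand-ins for amd's fserver and fh_cache.
 * Only fields that appear in the gdb output are kept. */
struct fserver_sketch {
    const char *fs_host;
    struct sockaddr_in *fs_ip;  /* 0x0 in the crash dump */
};

struct fh_cache_sketch {
    struct sockaddr_in fh_sin;
};

/* Compare the cached address with the server's address.
 * Returns 1 if they match, 0 if they differ, and -1 if the server has
 * no address at all (the case that segfaults in the report). */
int fh_addr_matches(const struct fh_cache_sketch *fp,
                    const struct fserver_sketch *fs)
{
    assert(fp != NULL && fs != NULL);

    if (fs->fs_ip == NULL)
        return -1;  /* nothing to compare against; do not dereference */

    return fp->fh_sin.sin_addr.s_addr == fs->fs_ip->sin_addr.s_addr;
}
```

With a guard of this kind, a caller could treat -1 as "server address
unknown" and skip the cache entry instead of crashing. Whether that is
the right behaviour inside amd is an assumption; the patch above avoids
the NULL pointer in the first place.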
>Audit-Trail:
>Unformatted: