Subject: kern/3579: "root on nfs type ?" config panics finding root
To: None <gnats-bugs@gnats.netbsd.org>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: netbsd-bugs
Date: 05/05/1997 14:54:38
>Number: 3579
>Category: kern
>Synopsis: "root on nfs type ?" config panics finding root
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 5 15:05:00 1997
>Last-Modified:
>Originator: Jonathan Stone
>Organization:
>Release: NetBSD-current 1.2D as at 1997-05-02
>Environment:
System: NetBSD Whisk.DSG.Stanford.EDU 1.2D NetBSD 1.2D (DSG_4K) #13: Fri May 2 19:47:26 PDT 1997 jonathan@Cup.DSG.Stanford.EDU:/aga/n1/src/NetBSD/IP-PLUS/src/sys/arch/i386/compile/DSG_4K i386
>Description:
A `diskless' kernel, when booting off a floppy on an i386, crashes
early during boot (apparently when accessing the root filesystem).
This seems to be non-deterministic and may be due to network traffic
(e.g., ntp chimes) aimed at the MAC address of the `diskless'-booting
host.
>How-To-Repeat:
Build a kernel with a config line
config nfsnetbsd root on ? type nfs
Put the resulting kernel on a floppy (e.g., as netbsd.gz, when
using the 2.0-beta bootblocks).
Boot on a 3c595 or a de-500.
Observe the kernel panic shortly after printing messages
identifying the interfaces where it found root and swap.
>Fix:
The following works around the problem for me.
I haven't followed through the code to check that the patch is
really correct (rather than just masking a symptom).
The printf() message should be taken out if the patch is committed.
(If it helps, I only see one such message per boot.)
*** nfs_socket.c.DIST Wed Apr 9 04:23:02 1997
--- nfs_socket.c Fri May 2 18:34:50 1997
***************
*** 663,668 ****
--- 663,679 ----
  		if (nam)
  			m_freem(nam);
+ 
+ 		/* XXX multihomed machines lose? */
+ 		if (mrep == 0) {
+ 			printf("nfs_reply: null mbuf from nfs_receive()\n");
+ #if 0
+ 			return (0);
+ #else
+ 			continue;
+ #endif
+ 		}
+ 
  		/*
  		 * Get the xid and check that it is an rpc reply
  		 */
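
For anyone reviewing the workaround outside the kernel, the idea can
be sketched as a small standalone C fragment. This is only an
illustration under stated assumptions, not the nfs_socket.c code:
struct reply, fake_receive() and wait_for_reply() are hypothetical
stand-ins (for an mbuf chain, nfs_receive() and the reply loop), and
the queue of replies is invented for the example. It shows a receive
call that reports success but hands back no reply, and a loop that
skips such a result instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf chain holding one RPC reply. */
struct reply {
	unsigned int xid;
};

/*
 * Hypothetical stand-in for nfs_receive(): returns 0 on success and
 * stores the next queued reply in *out.  A NULL entry in the queue
 * models the case in this report: success, but no mbuf.
 */
static int
fake_receive(struct reply **queue, size_t len, size_t *pos,
    struct reply **out)
{
	if (*pos >= len)
		return (-1);	/* queue exhausted; reported as an error */
	*out = queue[(*pos)++];
	return (0);
}

/*
 * Loop until the reply matching want_xid arrives.  Returns 0 when it
 * is found and -1 on a receive error.  *skipped counts the null
 * replies that were ignored along the way.
 */
static int
wait_for_reply(struct reply **queue, size_t len, unsigned int want_xid,
    int *skipped)
{
	size_t pos = 0;
	struct reply *mrep;

	assert(skipped != NULL);
	*skipped = 0;
	for (;;) {
		mrep = NULL;
		if (fake_receive(queue, len, &pos, &mrep) != 0)
			return (-1);
		/* The guard: never look inside a null reply. */
		if (mrep == NULL) {
			(*skipped)++;
			continue;
		}
		/* Only now is it safe to read the xid. */
		if (mrep->xid == want_xid)
			return (0);
	}
}
```

Calling wait_for_reply() on a queue that holds a reply, a NULL, and
then the wanted reply returns 0 with one entry skipped. Without the
guard, the xid check would dereference the NULL. Whether skipping is
the right behaviour in the kernel, as opposed to returning, is the
open question noted above.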
>Audit-Trail:
>Unformatted: