Subject: Re: problems attempting remote NFS root
To: Christian Limpach <chris@pin.lu>
From: Neil Ludban <nludban@columbus.rr.com>
List: port-xen
Date: 04/11/2004 08:56:26

Christian Limpach wrote:
> Hi!
> 
> 
>>[4] boot device: xennet0
>>[4] root on xennet0
>>[4] mountroot: trying nfs...
>>[4] nfs_boot: trying static
>>[4] nfs_boot: client_addr=192.168.1.31
>>[4] nfs_boot: gateway=192.168.1.42
>>[4] nfs_boot: netmask=255.255.255.0
>>[4] nfs_boot: server=192.168.1.17
>>[4] nfs_boot: root=192.168.1.17:/export/xen31
>>
>>
>>At this point it hangs, I see only one packet arrive at the NFS server:
>>
>>13:03:22.271279 0:10:5a:15:e4:10 ff:ff:ff:ff:ff:ff 0806 60: arp who-has
>>192.168.1.31 tell 192.168.1.31
> 
> 
> This looks right, there should be a 3 second pause and then an arp request
> with the NFS server's IP address.  This is the first place where the kernel
> goes to sleep.  Could you add printf's around the sleep to see if it ever
> returns from sleep?  The sleep's in line 287 in file nfs/nfs_boot.c.
> Something like:
>         /* give the link some time to get up */
>         printf("before nfs_boot_setaddress sleep\n");
>         tsleep(nfs_boot_setaddress, PZERO, "nfsbtd", 3 * hz);
>         printf("after nfs_boot_setaddress sleep\n");
> out:
>         soclose(so);
>         return (error);
> 
> This will show if timer interrupts and context switches work.
> 
>      christian
> 
> 

It never returns from tsleep:

[1] mountroot: trying nfs...
[1] nfs_boot: trying static
[1] nfs_boot: client_addr=192.168.1.31
[1] nfs_boot: gateway=192.168.1.42
[1] nfs_boot: netmask=255.255.255.0
[1] nfs_boot: server=192.168.1.17
[1] nfs_boot: root=192.168.1.17:/export/xen31
[1] nfs_boot_setaddress: sleeping (150)


# xc_dom_control.py list
Dom  Name             Mem(kb)  CPU  State  Time(s)
0    Domain-0          100000   0    r-      412
2    NetBSD VM 31       65536   0    --     1159

(wait about 10 seconds)

# xc_dom_control.py list
Dom  Name             Mem(kb)  CPU  State  Time(s)
0    Domain-0          100000   0    r-      413
2    NetBSD VM 31       65536   0    --     1250
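
For reference, the change between the two listings can be worked out
as below.  This is only a sketch: the struct and function names are
made up for illustration, and the numbers are copied from the output
above.

```c
#include <stdio.h>

/*
 * One domain's Time(s) value from the first and second listing.
 * Hypothetical type, not part of any Xen tool.
 */
struct dom_sample {
	int  dom;          /* Dom column */
	long time_before;  /* Time(s) in the first listing */
	long time_after;   /* Time(s) in the second listing */
};

/* Change in the Time(s) column between the two listings. */
static long
time_delta(const struct dom_sample *s)
{
	return s->time_after - s->time_before;
}

/* Print one line per domain showing how far Time(s) moved. */
static void
print_deltas(const struct dom_sample *samples, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		printf("Dom %d: Time(s) changed by %ld\n",
		    samples[i].dom, time_delta(&samples[i]));
}
```

With the values above, Domain-0 moves by 1 and the NetBSD domain by
91, so the guest keeps accumulating time while it sits in the sleep.
The output does not say how the Time(s) column is scaled, so the
numbers are best read relative to each other.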


I also tried reducing the memory allocated to each VM (the machine
has only 128 MB total), with no change.  With the tsleep commented
out, the boot gets as far as the mount request, which at least
proves that the network is functional:

[4] nfs_boot: mountd `192.168.1.17:/export/xen31', error=13
[4] no file system for xennet0
[4] cannot mount root, error = 79
[4] root device (default xennet0):
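
The two error numbers in that output can be decoded with a small
lookup like the one below.  It is a sketch, not kernel code: the table
and function names are hypothetical, the values follow NetBSD's
<sys/errno.h> numbering, and it assumes the mountd status is reported
using the same numbering.

```c
#include <stddef.h>

/*
 * The error numbers seen in the log, with NetBSD's names for them.
 * Hardcoded on purpose so the result does not depend on the errno
 * values of whatever host this is compiled on.
 */
struct errno_name {
	int         num;
	const char *name;
	const char *text;
};

static const struct errno_name netbsd_errnos[] = {
	{ 13, "EACCES", "Permission denied" },
	{ 79, "EFTYPE", "Inappropriate file type or format" },
};

/* Return the symbolic name for num, or NULL if it is not in the table. */
static const char *
netbsd_errno_name(int num)
{
	size_t i;

	for (i = 0; i < sizeof(netbsd_errnos) / sizeof(netbsd_errnos[0]); i++)
		if (netbsd_errnos[i].num == num)
			return netbsd_errnos[i].name;
	return NULL;
}
```

Read that way, error=13 from mountd is a permission failure on the
server side, and error = 79 is what mountroot reports once no file
system could be mounted.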


I'll get the NFS root export configured on the server and see what
happens, and maybe try an MFS root as well.  Do you have a
recommended method for attaching a debugger to the kernel running
in the VM?

-Neil