Subject: NFS directory listings sometimes corrupted (truncated)
To: None <current-users@netbsd.org>
From: Mark Davies <mark@mcs.vuw.ac.nz>
List: current-users
Date: 12/15/2003 18:38:18
I have a directory, usually holding around 2000 to 3000 files, that is NFS
served (v3 over TCP) from a 1.6ZC i386 box to a 1.6ZF i386 box. Sometimes the
client gets into a state where a listing of the directory shows only
approximately 300 of those files, and it remains in that state until something
causes it to issue another READDIR, at which point it sees the full list
again.
E.g. just now the directory had 2188 files in it, as an "ls | wc -l" on the
server indicated, but on the client machine "ls | wc -l" returned 327.
Removing one file on the server then had both machines agreeing that there were
2187 files in the directory.
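The check described above could be scripted on the client so that the
truncated state is noticed when it occurs. This is only a rough sketch: the
function names, the directory path and the "less than half the expected
count" threshold are assumptions for illustration, not anything taken from
the setup described here.

```shell
#!/bin/sh
# count_entries DIR: print the number of entries that ls reports for DIR
# (the same "ls | wc -l" check used by hand above).
count_entries() {
    ls "$1" | wc -l | tr -d ' '
}

# listing_looks_truncated SEEN EXPECTED: succeed when SEEN is below half
# of EXPECTED. The "half" threshold is an arbitrary assumption.
listing_looks_truncated() {
    [ "$1" -lt $(( $2 / 2 )) ]
}

# Example use on the client (path and expected count are placeholders):
#   seen=$(count_entries /path/to/nfs/dir)
#   listing_looks_truncated "$seen" 2188 && echo "$(date): only $seen entries"
```

Run periodically, something like this would at least record when the short
listing appears, which might help line it up with a packet trace.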
I note from a tcpdump that the full ls listing requires 8 or 9 READDIR
requests and the response to each is up to 6 packets long. I'm not sure whether the
327 files correspond to some meaningful subset of those responses.
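As a rough cross-check of that question, the average number of entries per
READDIR reply can be worked out from the figures above. The helper name is
made up for illustration, and the calculation assumes the replies are of
roughly equal size, which need not hold since entries vary with file name
length.

```shell
#!/bin/sh
# entries_per_reply TOTAL REPLIES: average entries per READDIR reply,
# rounded down by integer division.
entries_per_reply() {
    echo $(( $1 / $2 ))
}

entries_per_reply 2188 8   # prints 273
entries_per_reply 2188 9   # prints 243
```

On those averages, 327 entries is more than one reply's worth but well short
of two, so it does not obviously correspond to a whole number of replies.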
I believe the 1.6ZF client postdates all the recent NFS patches. The 1.6ZC
server clearly doesn't, but I'm not sure this is a server issue. As this
directory happens to be my mail inbox, this behaviour is _very_ troubling. I
don't believe I saw this problem with a 1.6L server and 1.6W client, but I
certainly did see it with a 1.6ZC client against the current server.
Any suggestions?
mark