Subject: advice debugging slow NFS
To: None <netbsd-help@netbsd.org>
From: Chris Jones <chris@cjones.org>
List: netbsd-help
Date: 06/04/2003 17:30:17
I know a thing or two about NFS, but I'm not sure where to go to debug
this problem. I have a NetBSD 1.6 NFS server which is being used by a
number of clients running NetBSD, Solaris, and Linux. On one particular
Linux machine, I'm getting very bad performance under heavy load.

The Linux machine (Red Hat 7.1, kernel 2.4.18) seems to think it's
getting timeouts from the server:
Jun 4 17:16:01 mothra kernel: nfs: server gamera not responding, still trying
Jun 4 17:16:04 mothra kernel: nfs: server gamera OK
Jun 4 17:16:14 mothra kernel: nfs: server gamera not responding, still trying
Jun 4 17:16:17 mothra kernel: nfs: server gamera OK
Jun 4 17:16:44 mothra kernel: nfs: server gamera not responding, still trying
Jun 4 17:16:50 mothra kernel: nfs: server gamera OK
Jun 4 17:17:01 mothra kernel: nfs: server gamera not responding, still trying
Jun 4 17:17:03 mothra kernel: nfs: server gamera OK
Here's nfsstat output from the client:
Client rpc stats:
calls         retrans       authrefrsh
4480473       20869         0
Client nfs v2:
null          getattr       setattr       root          lookup        readlink
0 0%          322463 7%     99276 2%      0 0%          2799942 64%   74 0%
read          wrcache       write         create        remove        rename
282297 6%     0 0%          386678 8%     149799 3%     142597 3%     27050 0%
link          symlink       mkdir         rmdir         readdir       fsstat
1369 0%       1 0%          23728 0%      29016 0%      58119 1%      4 0%
...and the server:
Server Info:
RPC Counts: (9877562 calls)
null          getattr       setattr       lookup        access
0 0%          2448440 24%   162085 1%     4053079 41%   599799 6%
readlink      read          write         create        mkdir
1524 0%       754165 7%     875915 8%     261489 2%     49522 0%
symlink       mknod         remove        rmdir         rename
1011 0%       0 0%          214657 2%     69273 0%      58719 0%
link          readdir       readdirplus   fsstat        fsinfo
2810 0%       130735 1%     3139 0%       184720 1%     70 0%
pathconf      commit        getlease      vacated       evicted
20 0%         6389 0%       0 0%          0 0%          0 0%
noop
1 0%
Server Errors:
RPC errors    faults
768950        0
Server Cache Stats:
inprogress    idem          non-idem      misses
8597          171           66            9806553
Server Lease Stats:
leases        maxleases     getleases
0             0             0
Server Write Gathering:
writes        write RPC     OPs saved
828567        875915        47348 5%
Note that the server reports a large number of RPC errors (768950 out
of 9877562 calls, about 8%), and the client has a fair number of
retransmissions (20869 out of 4480473 calls, about 0.5%). I don't see
any significant errors on the network interfaces (from "netstat -i").
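
As a sanity check on those ratios, here is a minimal Python sketch that
recomputes them from the counters quoted above. The variable names are
made up for illustration, and the numbers are copied by hand from the
two nfsstat listings rather than parsed from real nfsstat output.

```python
# Counters copied by hand from the nfsstat listings above
# (variable names are illustrative, not nfsstat field names).
client_calls = 4480473      # client "calls"
client_retrans = 20869      # client "retrans"
server_calls = 9877562      # server "RPC Counts" total
server_rpc_errors = 768950  # server "RPC errors"
write_rpcs = 875915         # server "write RPC"
gathered_writes = 828567    # server "writes" after write gathering


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


retrans_pct = percent(client_retrans, client_calls)
error_pct = percent(server_rpc_errors, server_calls)
ops_saved = write_rpcs - gathered_writes
saved_pct = percent(ops_saved, write_rpcs)

print(f"client retransmissions: {retrans_pct:.2f}%")
print(f"server RPC errors:      {error_pct:.2f}%")
print(f"write ops saved:        {ops_saved} ({saved_pct:.2f}%)")
```

With these counters the client retransmission rate comes out a little
under 0.5% and the server RPC error rate a little under 8%. The two
sides are not directly comparable, since the server counters cover all
clients and the client counters cover only this one machine.
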
What's the next step in debugging this?
Chris
--
Chris Jones chris@cjones.org www.cjones.org