Subject: kern/10202: kernel loops forever in nfs code
To: None <gnats-bugs@gnats.netbsd.org>
From: Antti Kantee <pooka@iki.fi>
List: netbsd-bugs
Date: 05/26/2000 10:01:13
>Number: 10202
>Category: kern
>Synopsis: kernel loops forever in nfs code
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri May 26 10:02:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Antti Kantee
>Release: -current from ~20th May 2000
>Organization:
>Environment:
System: NetBSD roboti.cs.hut.fi 1.4Y NetBSD 1.4Y (ROBOTI) #13: Tue May 23 18:27:04 EEST 2000 root@roboti.cs.hut.fi:/cvs/src/sys/arch/alpha/compile/ROBOTI alpha
>Description:
Occasionally a process enters an infinite loop in the kernel. Last
time the victim was cvs; this time it is csh. The nfsio threads have
accumulated suspiciously little processor time.
roboti# ps axl -M netbsd.5.core -N netbsd.5
UID   PID PPID CPU PRI NI  VSZ RSS WCHAN  STAT TT     TIME COMMAND
  0     0    0   0 -18  0    0   0 -      RLs  ??  0:00.03 (swapper)
  0     1    0   2  10  0  752   0 wait   Is   ??  0:00.06 (init)
  0     2    0   0 -18  0    0   0 daemon DL   ??  0:00.26 (pagedaemon)
  0     3    0   0 -18  0    0   0 reaper DL   ??  0:00.02 (reaper)
  0     4    0   0  18  0    0   0 -      RL   ??  0:00.18 (ioflush)
  0   122    0   0  10  0    0   0 nfsidl IL   ??  0:00.01 (nfsio)
  0   123    0   0  10  0    0   0 nfsidl IL   ??  0:00.00 (nfsio)
  0   124    0   0  10  0    0   0 nfsidl IL   ??  0:00.00 (nfsio)
  0   125    0   0  10  0    0   0 nfsidl IL   ??  0:00.00 (nfsio)
  0 14893    0  32  60  0  648   0 -      R    p3- 0:00.03 (reboot)
  0 14854    0   0   2  0 1048   0 -      R    p6- 1:33.35 (csh)
roboti# ps -ax -O paddr -M netbsd.5.core -N netbsd.5
  PID   PADDR TT  STAT    TIME COMMAND
    0  59d4d8 ??  RLs  0:00.03 (swapper)
    1 104a000 ??  Is   0:00.06 (init)
    2 104a258 ??  DL   0:00.26 (pagedaemon)
    3 104a4b0 ??  DL   0:00.02 (reaper)
    4 104a708 ??  RL   0:00.18 (ioflush)
  122 104bc20 ??  IL   0:00.01 (nfsio)
  123 31e8008 ??  IL   0:00.00 (nfsio)
  124 31e8260 ??  IL   0:00.00 (nfsio)
  125 31e84b8 ??  IL   0:00.00 (nfsio)
14893 3786978 p3- R    0:00.03 (reboot)
14854 31e92c8 p6- R    1:33.35 (csh)
(gdb) proc 0xfffffc00031e92c8
(gdb) bt
#0 0xfffffc000033a954 in mi_switch () at ../../../../kern/kern_synch.c:815
#1 0xfffffc0000339cf4 in tsleep (ident=0x0, priority=24,
wmesg=0xfffffc00005146b0 "netio", timo=5120)
at ../../../../kern/kern_synch.c:432
#2 0xfffffc000035c52c in sbwait (sb=0x0)
at ../../../../kern/uipc_socket2.c:274
#3 0xfffffc000035a688 in soreceive (so=0xfffffc0001172d80,
paddr=0xfffffe0006359698, uio=0xfffffe0006359608, mp0=0x0, controlp=0x0,
flagsp=0xfffffe000635963c) at ../../../../kern/uipc_socket.c:661
#4 0xfffffc00004355f4 in nfs_receive (rep=0xfffffe00001b0600, aname=0x0,
mp=0xfffffe00063596a0) at ../../../../nfs/nfs_socket.c:646
#5 0xfffffc00004356f8 in nfs_reply (myrep=0xfffffe00001b0600)
at ../../../../nfs/nfs_socket.c:700
#6 0xfffffc00004360c8 in nfs_request (vp=0xfffffc0001a88038,
mrest=0xfffffc000293e680, procnum=16, procp=0xfffffc000293e880,
cred=0xfffffe00000f5700, mrp=0xfffffe0006359848, mdp=0xfffffe0006359850,
dposp=0xfffffe0006359858) at ../../../../nfs/nfs_socket.c:987
#7 0xfffffc000045cd70 in nfs_readdirrpc (vp=0xfffffc0001a88038,
uiop=0xfffffe00063598e8, cred=0xfffffe00000f5700)
at ../../../../nfs/nfs_vnops.c:2082
#8 0xfffffc000041578c in nfs_doio (bp=0xfffffc0000235998,
cr=0xfffffe00000f5700, p=0xfffffc00031e92c8)
at ../../../../nfs/nfs_bio.c:1054
#9 0xfffffc0000413f44 in nfs_bioread (vp=0xfffffc0001a88038,
uio=0xfffffe0006359b88, ioflag=0, cred=0xfffffe00000f5700, cflag=0)
at ../../../../nfs/nfs_bio.c:346
#10 0xfffffc000045c0b0 in nfs_readdir (v=0x0)
at ../../../../nfs/nfs_vnops.c:1961
#11 0xfffffc0000367024 in getcwd_scandir (lvpp=0xfffffe0006359db0, uvpp=0x200,
bpp=0xfffffe0006359dc0, bufp=0xfffffe00000b3000 "", p=0xfffffc00031e92c8)
at ../../../../sys/vnode_if.h:657
#12 0xfffffc0000367560 in getcwd_common (lvp=0xfffffc000383ce28,
rvp=0xfffffc0001060158, bpp=0xfffffe0006359e38,
bufp=0xfffffe00000b3000 "", limit=512, flags=104177072,
p=0xfffffc00031e92c8) at ../../../../kern/vfs_getcwd.c:472
#13 0xfffffc00003677e8 in sys___getcwd (p=0xfffffc00031e92c8,
v=0xfffffe0006359e88, retval=0xfffffe0006359ed8)
at ../../../../kern/vfs_getcwd.c:588
#14 0xfffffc00004e2bbc in syscall (code=296, framep=0xfffffe0006359ef8)
at ../../../../arch/alpha/alpha/trap.c:698
#15 0xfffffc000030046c in XentSys ()
at ../../../../arch/alpha/alpha/locore.s:589
warning: Hit heuristic-fence-post without finding
warning: enclosing function for address 0x12002d89c
>How-To-Repeat:
There is no known reliable way to reproduce this, but the problem shows up
after the system has been running for some time.
>Fix:
please...
>Release-Note:
>Audit-Trail:
>Unformatted: