NetBSD-Bugs archive
kern/45093: kernel deadlock between TCP and UVM involving callouts
>Number: 45093
>Category: kern
>Synopsis: kernel deadlock between TCP and UVM involving callouts
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jun 21 16:45:00 +0000 2011
>Originator: Manuel Bouyer
>Release: NetBSD 5.1
>Organization:
>Environment:
System: NetBSD armandeche.soc.lip6.fr 5.1 NetBSD 5.1 (GENERIC) #0: Sun Nov 7
14:39:56 UTC 2010
builds%b6.netbsd.org@localhost:/home/builds/ab/netbsd-5-1-RELEASE/i386/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/i386/compile/GENERIC
i386
Architecture: i386
Machine: i386
>Description:
A deadlock condition exists in the NFS server; it is easy to reproduce
on my server here.
When the NFS server closes a socket, it calls uvm_unloanpage()
(through soclose->sodisconnect->sodopendfree->sodopendfreel)
with softnet_lock held. uvm_unloanpage() can then call kpause().
If a network callout (e.g. a TCP timer) fires while the nfsd thread
is paused, the callout blocks trying to get softnet_lock,
and the softclock thread goes to sleep. As a result the kpause()
is never woken up, and we have a deadlock:
the softclock thread waits for softnet_lock, and the thread holding
softnet_lock waits to be woken up by the softclock thread.
More details and a stack trace are in
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html
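
The wait-for cycle described above can be modelled in a few lines of
C. This is only an illustrative sketch, not kernel code: struct model,
make_model() and is_deadlocked() are hypothetical names, and the model
assumes each thread blocks on at most one other thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model only; none of these types exist in the kernel. */
enum thread_id { T_NONE = -1, T_NFSD = 0, T_SOFTCLOCK = 1, T_COUNT = 2 };

struct model {
    enum thread_id softnet_lock_owner; /* holder of softnet_lock, or T_NONE */
    enum thread_id waits_for[T_COUNT]; /* thread each one is blocked on */
};

/*
 * State at the moment of the hang: nfsd sleeps in kpause() and can only
 * be woken by softclock, while a TCP callout running in softclock wants
 * softnet_lock and so waits for whoever owns it.
 */
struct model make_model(bool nfsd_holds_softnet_lock)
{
    struct model m;

    m.softnet_lock_owner = nfsd_holds_softnet_lock ? T_NFSD : T_NONE;
    m.waits_for[T_NFSD] = T_SOFTCLOCK;
    m.waits_for[T_SOFTCLOCK] = m.softnet_lock_owner;
    return m;
}

/* Follow the wait-for edges from each thread; returning to it is a cycle. */
bool is_deadlocked(const struct model *m)
{
    assert(m != NULL);
    for (int start = 0; start < T_COUNT; start++) {
        enum thread_id cur = m->waits_for[start];

        for (int hops = 0; hops < T_COUNT && cur != T_NONE; hops++) {
            if ((int)cur == start)
                return true;
            cur = m->waits_for[cur];
        }
    }
    return false;
}
```

Calling make_model(true) models the reported situation (nfsd sleeping
with softnet_lock held) and is_deadlocked() reports a cycle; with
make_model(false), where the lock is not held across the sleep, the
cycle disappears.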
>How-To-Repeat:
Run an NFS server with some NFS activity, some local activity
(so that there is contention on vnode locks and uvm_unloanpage() has
to sleep), and enough network activity to keep TCP callouts
pending.
>Fix:
Workaround: either disable page loaning in the NFS server, or
change uvm_unloanpage() to use yield() instead of kpause()
(the latter has been confirmed to work around the issue).
A longer-term fix is to avoid having threads sleep for a long time
while holding softnet_lock.
For this specific case, maybe sodopendfree() can be transferred to
another thread, or the socket's lock (which is softnet_lock for
TCP sockets) can be dropped before calling sodopendfree() and re-taken
after.
sokva_reclaim_callback() and sokvareserve() may have the same issue
if called with the socket locked.
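
The "drop the lock around the call" idea could look roughly like the
following single-threaded model. It is a sketch under assumptions:
struct toy_lock, call_unlocked() and pending_free_stub() are
hypothetical names standing in for the socket lock and sodopendfree();
the real code would use the kernel's own locking primitives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the socket lock; NOT the kernel's mutex type. */
struct toy_lock {
    bool held;
};

void toy_lock_enter(struct toy_lock *l)
{
    assert(!l->held);   /* single-threaded model: re-entry is a bug */
    l->held = true;
}

void toy_lock_exit(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/*
 * Hypothetical helper: run a function that may sleep with the lock
 * dropped. The caller holds the lock on entry and again on return.
 */
void call_unlocked(struct toy_lock *l, void (*fn)(void *), void *arg)
{
    toy_lock_exit(l);
    fn(arg);            /* may sleep; the lock is free meanwhile */
    toy_lock_enter(l);
}

/* Records whether the lock was held while the callback ran. */
struct probe {
    const struct toy_lock *lock;
    bool lock_was_held;
};

/* Stand-in for the work done by sodopendfree() in this model. */
void pending_free_stub(void *arg)
{
    struct probe *p = arg;

    p->lock_was_held = p->lock->held;
}
```

One caveat with this pattern: anything protected by the lock may have
changed while it was dropped, so the caller would have to re-check
the socket state after the lock is re-taken.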