NetBSD-Bugs archive
Re: kern/42455: tstile hang with nfs
The following reply was made to PR kern/42455; it has been noted by GNATS.
From: yamt%mwd.biglobe.ne.jp@localhost (YAMAMOTO Takashi)
To: Christoph_Egger%gmx.de@localhost
Cc: gnats-bugs%NetBSD.org@localhost, kern-bug-people%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/42455: tstile hang with nfs
Date: Thu, 28 Oct 2010 05:41:19 +0000 (UTC)
hi,
> On 28.10.10 06:50, YAMAMOTO Takashi wrote:
>> The following reply was made to PR kern/42455; it has been noted by GNATS.
>>
>> From: yamt%mwd.biglobe.ne.jp@localhost (YAMAMOTO Takashi)
>> To: Christoph_Egger%gmx.de@localhost
>> Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%netbsd.org@localhost,
>> gnats-admin%netbsd.org@localhost,
>> kern-bug-people%netbsd.org@localhost
>> Subject: Re: kern/42455: tstile hang with nfs
>> Date: Thu, 28 Oct 2010 04:48:41 +0000 (UTC)
>>
>> hi,
>>
>> >> hi,
>> >>
>> >> >> > I added some more debug lines and figured out that the macro
>> >> >> > nfsm_wcc_data() drops the mbuf chain w/o decreasing
>> >> >> > ctxt.nwc_mbufcount.
>> >> >>
>> >> >> The nfsm_wcc_data() macro calls the nfsm_postop_attr() macro.
>> >> >> The nfsm_postop_attr() macro calls nfsm_loadattrcache() function.
>> >> >> The nfsm_loadattrcache() function calls nfsm_disct() function.
>> >> >>
>> >> >> nfsm_disct() is the function in error which drops the mbuf chain.
>> >>
>> >> are you sure?
>> >
>> > yes, absolutely and reproducible.
>> >
>> >> iirc, nwc_mbufcount is about sending mbuf. otoh, nfsm_disct
>> >> is for received mbuf.
>> >
>> > nfs_writerpc *does* call nfsm_disct() through nfsm_wcc_data,
>> > nfsm_postop_attr and nfsm_loadattrcache in this order.
>> >
>> > So you are saying this should never happen?
>>
>> i'm saying i don't understand.
>>
>> nfs_writerpc sends a request to the server, using mreq and mb.
>> it's what nwc_mbufcount is used for.
>>
>> it then parses the reply from the server, using mrep and md.
>> it's what nfsm_wcc_data/nfsm_postop_attr/nfsm_loadattrcache/nfsm_disct are
>> used for.
>
> Ah, I see.
>
>> i don't understand how a problem in the latter causes the nwc_mbufcount
>> problem. the above two are somehow mixed up?
>
> nfsm_disct() creates new mbufs with m_get() and MCLAIM().
> nfs_writerpc() relies on the ext hook being called on m_free.
>
> But nfsm_disct() does *not* use MEXTADD(), so the ext hook is empty.
> => nfs_writerpc_extfree() won't be called to decrement nwc_mbufcount
how is it a problem? nwc_mbufcount is not incremented for the mbuf
allocated by nfsm_disct.
> => nfs_writerpc() calls cv_wait() which waits forever.
it waits for the sending mbuf chain to be consumed. that is a separate mbuf
chain from the one nfsm_disct works on.
if i were you, i'd look for an mbuf leak in the underlying network stack
and driver. sprinkling MCLAIM might help.
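
The accounting argued above can be illustrated with a minimal userland
sketch. This is not the kernel code: struct write_ctx, struct buf and every
function name below are hypothetical stand-ins, with pthread primitives in
place of the kernel's mutex and cv_wait(). It only models the claim that
buffers on the sending chain are counted and carry a free hook, while
buffers allocated while parsing the reply are neither counted nor hooked,
so freeing them cannot affect the wait.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the write context; mbufcount models nwc_mbufcount. */
struct write_ctx {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	int mbufcount;
};

/* Hypothetical stand-in for an mbuf; ext_free models the external free hook. */
struct buf {
	void (*ext_free)(void *);
	void *ext_arg;
};

static void
ctx_init(struct write_ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cv, NULL);
	c->mbufcount = 0;
}

/* Models the role of nfs_writerpc_extfree(): one sending buffer was consumed. */
static void
send_extfree(void *arg)
{
	struct write_ctx *c = arg;

	pthread_mutex_lock(&c->lock);
	if (--c->mbufcount == 0)
		pthread_cond_signal(&c->cv);
	pthread_mutex_unlock(&c->lock);
}

/* Sending side: the buffer is counted and gets the free hook. */
static struct buf *
send_buf_alloc(struct write_ctx *c)
{
	struct buf *b = malloc(sizeof(*b));

	if (b == NULL)
		return NULL;
	b->ext_free = send_extfree;
	b->ext_arg = c;
	pthread_mutex_lock(&c->lock);
	c->mbufcount++;
	pthread_mutex_unlock(&c->lock);
	return b;
}

/* Reply-parsing side: no hook, and the counter is never incremented. */
static struct buf *
reply_buf_alloc(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (b == NULL)
		return NULL;
	b->ext_free = NULL;
	b->ext_arg = NULL;
	return b;
}

/* Free a buffer, running the hook only if one was attached. */
static void
buf_free(struct buf *b)
{
	if (b->ext_free != NULL)
		b->ext_free(b->ext_arg);
	free(b);
}

/* Read the number of sending buffers not yet consumed. */
static int
ctx_outstanding(struct write_ctx *c)
{
	int n;

	pthread_mutex_lock(&c->lock);
	n = c->mbufcount;
	pthread_mutex_unlock(&c->lock);
	return n;
}

/* Models the wait in the writer: blocks until every sending buffer is freed. */
static void
ctx_wait(struct write_ctx *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->mbufcount > 0)
		pthread_cond_wait(&c->cv, &c->lock);
	pthread_mutex_unlock(&c->lock);
}
```

In this model, freeing a buffer from reply_buf_alloc() leaves
ctx_outstanding() unchanged, and ctx_wait() blocks forever only if a buffer
from send_buf_alloc() is never passed to buf_free(), which corresponds to a
leak of the sending chain somewhere below.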
YAMAMOTO Takashi
>
> Christoph