Subject: Re: kern/35224: kernel hangs in mclpl after heavy net load in the sparc64 port (eventually also other ports)
To: Martin Husemann <martin@duskware.de>
From: Stephan Pietzko <stephan.pietzko@uni-konstanz.de>
List: netbsd-bugs
Date: 12/10/2006 08:22:11
Martin Husemann <martin@duskware.de> wrote
>
> This sounds like a mbuf leak. Check netstat output, maybe you can spot where
> the mbuf are lingering.
>
> > the machine is crashing once a day
> Is this related? If not, please file a separate PR.
Yes, this is related.
> What kind of crash is it? A panic should print a message before rebooting, we
> need at least that to even start thinking about this.
Sorry, I meant: the daemon freezes in the mclpl state, and because
of that I have to reboot. This happens every few days, or sometimes
several times a day. I called this 'the machine is crashing once a
day', but it is still the mclpl problem.
I have saved lsof, ps and top output from several occurrences of this problem:
----------------------------------------------------------------------
root@nepal:/root> grep -in mclpl *.txt
050206top.txt:9: 25980 www -22 0 2224K 169M mclpl 237:43 0.00% 0.00% <thttpd>
110706mclpl_top.txt:9: 4011 www -22 0 3840K 11M mclpl 54.8H 0.00% 0.00% <lighttpd>
170506mclpl_top.txt:10: 11711 www -22 0 2008K 13M mclpl 188:25 0.00% 0.00% <lighttpd>
270406mclpl_top.txt:9: 29157 www -22 0 4424K 25M mclpl 22.4H 0.00% 0.00% <lighttpd>
grr.txt:9: 370 www -22 0 2368K 25M mclpl 283:54 0.00% 0.00% <lighttpd>
170506mclpl_ps.txt:21: www 11711 0.0 0.0 2008 12984 ? DW Sat11PM 188:25.32 /usr/pk 500 11711 1 13 -22 0 2008 12984 mclpl DW ? 188:25.32 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf
270406mclpl_ps.txt:22: www 29157 0.0 0.0 4424 25904 ? DW 16Apr06 1342:15.24 /usr/pk 500 29157 1 6 -22 0 4424 25904 mclpl DW ? 1342:15.24 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf
050206ps.txt:19: www 25980 0.0 0.0 2224 173248 ? DWs Fri05PM 237:43.33 /usr/pk 500 25980 1 8 -22 0 2224 173248 mclpl DWs ? 237:43.33 /usr/pkg/sbin/thttpd -C /usr/pkg/etc/thttpd.conf
110706mclpl_ps.txt:18:500 4011 1 3 -22 0 3840 11088 mclpl DW ? 3287:04.29 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf
----------------------------------------------------------------------
I just grepped the httpd lines out of the top or ps output captured
during each hang. I don't know whether this helps.
I will try Pavel's idea
'netstat -mssv with a kernel built with "options MBUFTRACE"'
as the next step.
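To collect such output over time, so that mbuf counts from before a hang can be compared with those during it, a small helper along these lines could be used. This is only a sketch: the function name `snapshot` and the log path in the usage line are made up for illustration, and the command to run is passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper (name and log path are assumptions, not from this PR):
# append a timestamped snapshot of a command's output to a log file.
# Usage: snapshot LOGFILE COMMAND [ARGS...]
snapshot() {
    snapshot_log=$1
    shift
    {
        # Header line makes it easy to find each sample in the log.
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Run the given command, keeping stderr with the output.
        "$@" 2>&1
    } >> "$snapshot_log"
}
```

For example, `snapshot /var/tmp/mbuf.log netstat -m` run every few minutes (from cron, or in a loop with `sleep`) would leave a history in which steadily growing mbuf or cluster counts should stand out.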
Thanks,
Stephan Pietzko