Subject: Re: problems with nmbcluster (?)
To: None <6bone@6bone.informatik.uni-leipzig.de>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-net
Date: 01/07/2007 22:16:59
On Sun, Jan 07, 2007 at 08:28:10PM +0100, 6bone@6bone.informatik.uni-leipzig.de wrote:
> On Sun, 7 Jan 2007, Manuel Bouyer wrote:
>
> >Date: Sun, 7 Jan 2007 19:09:59 +0100
> >From: Manuel Bouyer <bouyer@antioche.eu.org>
> >To: 6bone@6bone.informatik.uni-leipzig.de
> >Cc: tech-net@NetBSD.org
> >Subject: Re: problems with nmbcluster (?)
> >
> >On Sun, Jan 07, 2007 at 05:44:48PM +0100,
> >6bone@6bone.informatik.uni-leipzig.de wrote:
> >>hello,
> >>
> >>I have some problems with the network. I have to restart my server
> >>repeatedly, because after some days the server loses all connection to
> >>the network. You cannot establish any connections or do any pings. You can
> >>only restart the server. After the restart everything works fine for some
> >>days.
> >>
> >>I have tested several kernels (3.0, 3.1, current, ...) but the same
> >>effect always occurs. The server runs no special services, only apache2 and
> >>postgresql from pkgsrc. I don't know why the problem occurs only on my
> >>system. It is a dual i386/PIII with IPv6 enabled and an Intel NIC.
> >>
> >>I cannot give you more specific hints, only the output of 'netstat -mss'
> >>taken after the connection was lost:
> >>
> >>1441 mbufs in use:
> >> 1150 mbufs allocated to data
> >> 291 mbufs allocated to packet headers
> >>132521 calls to protocol drain routines
> >>
> >>
> >>Can anyone give me a hint for a possible solution or workaround? The
> >>continuous restarts are no longer possible. I have already replaced all
> >>of the hardware and software.
> >
> >What does 'vmstat -m|grep mclpl' show?
> >
> >--
> >Manuel Bouyer <bouyer@antioche.eu.org>
> > NetBSD: 26 ans d'experience feront toujours la difference
> >--
> >
>
> The uptime at the moment is only 4h, so I can only report the current
> output:
>
> netstat -mss && vmstat -m|grep mclpl
>
> 1497 mbufs in use:
> 1110 mbufs allocated to data
> 387 mbufs allocated to packet headers
> 34 calls to protocol drain routines
>
> vmstat: Kmem statistics are not being gathered by the kernel.
> mclpl 2048 1578 0 938 408 74 334 398 4 512
I suspect your system is running out of mclpl (the mbuf cluster pool) on
occasion, and this causes the network adapter (or the IP stack) to stall.
Try bumping nmbclusters. For example, on ftp.fr.netbsd.org I have it set
to 8192.
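
To keep an eye on the pool between reboots, a small script can read the
mclpl line and report how close the pool has come to its page limit. This is
only a rough sketch: the column names below are an assumption taken from the
pool header that 'vmstat -m' prints on NetBSD, and the helper names
parse_pool_line and pool_pressure are made up for illustration. Check the
header on your own system before relying on it.

```python
# Column names are ASSUMED from the pool header of NetBSD's `vmstat -m`;
# verify them against the header line printed on your own system.
FIELDS = ["size", "requests", "fail", "releases", "pgreq", "pgrel",
          "npage", "hiwat", "minpg", "maxpg"]


def parse_pool_line(line):
    """Split one pool line into its name and a dict of numeric counters."""
    parts = line.split()
    name, values = parts[0], parts[1:]
    stats = {}
    # zip() stops at the shorter sequence, so extra or missing trailing
    # columns are ignored rather than raising an error.
    for key, val in zip(FIELDS, values):
        # Pools without a page limit may print "inf" instead of a number.
        stats[key] = float("inf") if val == "inf" else int(val)
    return name, stats


def pool_pressure(stats):
    """Return (failed requests, peak pages as a fraction of the page limit)."""
    return stats["fail"], stats["hiwat"] / stats["maxpg"]


# The mclpl line quoted earlier in this thread.
line = "mclpl 2048 1578 0 938 408 74 334 398 4 512"
name, stats = parse_pool_line(line)
fail, peak = pool_pressure(stats)
print(f"{name}: {fail} failed requests, peak {peak:.0%} of page limit")
```

A non-zero failure count, or a peak close to 100%, some days into the uptime
would support the theory that the pool is being exhausted.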
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--