Port-xen archive
Re: Dom0 xvif mbuf issues
On Thu, 4 Oct 2018 15:39:47 -0700
Harry Waddell <waddell%caravaninfotech.com@localhost> wrote:
> On Thu, 4 Oct 2018 17:01:21 +0200
> Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
>
> > On Mon, Oct 01, 2018 at 05:10:19PM -0700, Harry Waddell wrote:
> > > > Looks like temporary memory shortage in the dom0 (this is a MGETHDR failing,
> > > > not MCLGET, so the nmbclusters limit is not relevant).
> > > > How many mbufs were allocated ?
> > > >
> > > At the time of the hang, I have no idea.
> > >
> > > It's around 512 whenever I check.
> > >
> > > [root@xen-09:conf]> netstat -m
> > > 515 mbufs in use:
> > > 513 mbufs allocated to data
> > > 2 mbufs allocated to packet headers
> > > 0 calls to protocol drain routines
> >
> > Looks like the receive buffers for the ethernet interface.
> >
> > > > > It hung again, but with a new
> > > > > error scrolling on the console. "xennetback: got only 63 new mcl pages"
> > > >
> > > > This would point to a memory shortage in the hypervisor itself.
> > > > Do you have enough free memory (xl info) ?
> > > >
> > > total_memory : 131037
> > > free_memory : 26601
> > > sharing_freed_memory : 0
> > > sharing_used_memory : 0
> > > outstanding_claims : 0
> >
> > That should be plenty. No idea why xennetback couldn't get the 64 pages
> > it asked for.
> >
> > > > > This is a production system with about 30 guests. I just want it to work like it used to.
> > > >
> > > > How many vifs are there in the dom0?
> > > >
> > >
> > > I expect this is not an ideal way to do this but ...
> > >
> > > (for i in `xl list | awk '{print $1}'`;do xl network-list $i | grep vif ;done) | wc -l
> > > 57
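> > >
> > > A somewhat tidier sketch (untested) that skips the header and Domain-0 rows
> > > and just counts the vif entries would be:
> > >
> > > xl list | awk 'NR > 2 { print $1 }' | xargs -n1 xl network-list 2>/dev/null | grep -c vif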
> > >
> > > Several of the systems are part of a cluster where hosts are multihomed on 2 of 4 networks
> > > to test a customer setup. Most of my systems have < 30, except for one other with 42.
> > > The others don't hang like this one does.
> >
> > I have more than 100 here.
> >
> > Maybe you should try reverting
> > kern.sbmax=1048576
> > net.inet.tcp.recvbuf_max=1048576
> > net.inet.tcp.sendbuf_max=1048576
> >
> > to their default values.
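> >
> > At runtime that would be something like (assuming the stock defaults are
> > 262144, which is worth double-checking on a box you haven't tuned):
> >
> > sysctl -w kern.sbmax=262144
> > sysctl -w net.inet.tcp.recvbuf_max=262144
> > sysctl -w net.inet.tcp.sendbuf_max=262144
> >
> > and comment out the corresponding lines in /etc/sysctl.conf so the old
> > values don't come back at the next boot.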
> >
> > --
> > Manuel Bouyer <bouyer%antioche.eu.org@localhost>
> > NetBSD: 26 years of experience will always make the difference
> > --
>
> No doubt sage advice, but it feels like we're missing something. All my servers have the same
> values, or higher, e.g.
>
> [root@xen-12:~]> netstat -m
> 2050 mbufs in use:
> 2049 mbufs allocated to data
> 1 mbufs allocated to packet headers
> 200 calls to protocol drain routines
>
> [root@xen-12:~]> egrep -v '^#' /etc/sysctl.conf
> ddb.onpanic?=0
> kern.sbmax=4194304
> net.inet.tcp.sendbuf_max=1048576
> net.inet.tcp.recvbuf_max=1048576
>
> [root@xen-12:~]> xl info | grep mem
> total_memory : 262032
> free_memory : 116527
> sharing_freed_memory : 0
> sharing_used_memory : 0
> xen_commandline : dom0_mem=8192M,max:8192M sched=credit2
> dom0_nodes=1 dom0_max_vcpus=1 dom0_vcpus_pin
>
> and no similar problems on any of those.
>
> I brought up the azure-fuse system and pushed a bunch of data through it. At no point
> did the mbuf use go above 700. I'm going to try to gather some more data during the upcoming
> weekend's automated testing, to see whether it's exercising things in a weird way, before I change
> anything else and return sbmax to the default, etc...
>
> I put this crude command into crontab:
>
> logger -p local0.warn `netstat -m | tr '\n' '|'`
>
> so it might be interesting to see what happens this weekend.
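>
> The actual crontab line is just that command on a one-minute schedule, roughly
> (quoting the command substitution keeps the output as a single logger argument):
>
> * * * * * logger -p local0.warn "`netstat -m | tr '\n' '|'`"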
>
> Thanks again.
>
> HW
>
Well, it crashed again, but I captured some data. It looks like the number of calls to
protocol drain routines, which is normally zero, started climbing before the crash.
There was also a spike in the number of allocated mbufs at the beginning of the period
of mbuf growth:
Oct 5 19:45:00 xen-09 root: 513 mbufs in use:| 512 mbufs allocated to data| 1 mbufs allocated to packet headers|2 calls to protocol drain routines|
Oct 5 19:46:00 xen-09 root: 867 mbufs in use:| 866 mbufs allocated to data| 1 mbufs allocated to packet headers|4 calls to protocol drain routines|
Oct 5 19:47:00 xen-09 root: 513 mbufs in use:| 512 mbufs allocated to data| 1 mbufs allocated to packet headers|8 calls to protocol drain routines|
...
Oct 5 20:10:00 xen-09 root: 515 mbufs in use:| 514 mbufs allocated to data| 1 mbufs allocated to packet headers|76 calls to protocol drain routines|
Oct 5 20:11:00 xen-09 root: 515 mbufs in use:| 514 mbufs allocated to data| 1 mbufs allocated to packet headers|76 calls to protocol drain routines|
Oct 5 20:12:00 xen-09 root: 529 mbufs in use:| 528 mbufs allocated to data| 1 mbufs allocated to packet headers|78 calls to protocol drain routines|
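
A quick way to pull the drain-routine counts back out of the logs and see when
they start climbing is something like this (rough awk sketch; the log path is
whatever syslog sends local0.warn to here):

awk -F'|' '/mbufs in use/ {
        split($1, ts, " ")                      # timestamp is the start of field 1
        for (i = 2; i <= NF; i++)
                if ($i ~ /protocol drain/) {    # e.g. "76 calls to protocol drain routines"
                        split($i, n, " ")
                        print ts[1], ts[2], ts[3], n[1]
                }
}' /var/log/messages
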
HW