Port-xen archive
Re: xennet performance collapses with multiple vCPU
Hi,
On Mon, Apr 13, 2020 at 01:44:10PM +0200, Manuel Bouyer wrote:
> On Sun, Apr 05, 2020 at 08:57:45PM +0000, Andrew Doran wrote:
> > > [...]
> > > I've now tracked it down to this change:
> > >
> > > Module Name: src
> > > Committed By: ad
> > > Date: Mon Jan 13 20:30:08 UTC 2020
> > >
> > > Modified Files:
> > > src/sys/kern: subr_cpu.c
> > >
> > > Log Message:
> > > Fix some more bugs in the topo stuff, that prevented it from working
> > > properly with fake topo info + MP.
> > >
> > >
> > > To generate a diff of this commit:
> > > cvs rdiff -u -r1.10 -r1.11 src/sys/kern/subr_cpu.c
> > >
> > > After this change the DomU even boots visibly more slowly. Maybe this
> > > change makes the scheduler use all CPUs on MP systems, but introduces too
> > > much switching between them? Andy, can you have a look?
> > >
> > > Meanwhile, I'll check whether there is anything obvious in the fake topology code.
> >
> > I spent some time looking into this over the weekend. It's easily
> > reproducible, and I don't see anything that looks strange on the various
> > systems involved. I also don't see why it would be related to the
> > scheduler.
>
> Hello,
> here is some more data on this issue. With ping I see a consistent 10ms delay:
> PING nephtys.lip6.fr (195.83.118.1): 56 data bytes
> 64 bytes from 195.83.118.1: icmp_seq=0 ttl=253 time=8.964715 ms
> 64 bytes from 195.83.118.1: icmp_seq=1 ttl=253 time=10.080450 ms
> 64 bytes from 195.83.118.1: icmp_seq=2 ttl=253 time=10.079291 ms
> 64 bytes from 195.83.118.1: icmp_seq=3 ttl=253 time=10.079525 ms
> 64 bytes from 195.83.118.1: icmp_seq=4 ttl=253 time=10.083389 ms
> 64 bytes from 195.83.118.1: icmp_seq=5 ttl=253 time=10.080444 ms
> 64 bytes from 195.83.118.1: icmp_seq=6 ttl=253 time=10.079615 ms
> 64 bytes from 195.83.118.1: icmp_seq=7 ttl=253 time=10.081661 ms
>
> Sometimes it drops to 5ms and stays there.
>
> With a single CPU, the RTT is less than one millisecond.
> Keeping both CPUs busy with a while(1) loop doesn't help.
>
> It looks like something is being delayed until the next clock tick.
> Note that the dom0 is idle and no other VMs are running.
>
> I'm seeing the same behavior in the bouyer-xenpvh branch, where Xen
> now has fast softints and kpreempt. Disabling the latter, or both, doesn't
> change anything. I'm also seeing the same with a kernel from
> bouyer-xenpvh-base, so it's not related to changes in the branch.
>
> Any ideas welcome.
This confirms my suspicion and is why I wanted to play with HZ. I think
soft interrupt processing is being driven off the clock interrupt. Maybe
hardware interrupt processing, but I think that's less likely. Native x86
configured the same way does not behave like this. Hmm. I will play around
with it some more.
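
To put rough numbers on the tick theory: with the default HZ=100 a tick is
10ms, and ping(8)'s default 1 second interval is an exact multiple of that,
so if replies are only handed up at the next clock tick, every packet pays
the same phase-dependent penalty. That would account for both the flat
~10ms column above and the occasional stable 5ms. Here is a minimal
userland sketch of that model (the defer-to-next-tick behaviour is the
hypothesis, not confirmed kernel behaviour, and the program is purely
illustrative):

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define HZ	100			/* NetBSD default */
#define TICK_MS	(1000.0 / HZ)		/* 10ms per clock tick */
#define PING_MS	1000.0			/* ping(8) default interval */

/* Round a timestamp up to the next clock tick boundary. */
static double
next_tick(double t_ms)
{
	return ceil(t_ms / TICK_MS) * TICK_MS;
}

int
main(void)
{
	/* Random phase of the first ping relative to the tick. */
	double phase = (double)arc4random() / UINT32_MAX * TICK_MS;

	for (int seq = 0; seq < 8; seq++) {
		double sent = phase + seq * PING_MS;
		/* Model: the reply is only processed at the next tick. */
		double penalty = next_tick(sent) - sent;
		printf("icmp_seq=%d modelled tick penalty %.3f ms\n",
		    seq, penalty);
	}
	return 0;
}

Build with -lm; because the ping interval divides evenly into ticks, every
modelled sequence number prints the same penalty, just like the flat RTT
column. If this is right, a domU kernel built with "options HZ=1000"
should shrink the plateau toward 1ms.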
Andrew