Port-xen archive
Re: NetBSD/xen goes off the network - reproduceable
On Fri, Feb 17, 2012 at 10:26:42PM -0500, Brian Marcotte wrote:
> > I guess most receive buffers end up in the socket, but there should
> > still be one available to make progress. I guess there's a bug
> > somewhere and this one is not reused.
> > Can you see what happens in xennet_rx_mbuf_free, especially for the
> > sc->sc_free_rxreql and SC_NLIVEREQ(sc) numbers?
>
> For this test, I'm printing those values right at the start of
> xennet_rx_mbuf_free. Also, in xennet_handler I'm printing the values of
> "i" and sc->sc_free_rxreql when it enters the code where it is about to
> do a copy.
>
> A complete console log is available here:
>
> http://www.panix.com/~marcotte/consolelog.txt
>
> Here is a summary. The most interesting part is probably at the bottom
> where the network stops completely.
>
> Thanks.
>
> ------------
>
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=251
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=251
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=251
> xennet_rx_mbuf_free: sc->sc_free_rxreql=3 SC_NLIVEREQ(sc)=251
> xennet_rx_mbuf_free: sc->sc_free_rxreql=4 SC_NLIVEREQ(sc)=251
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=47 SC_NLIVEREQ(sc)=206
> xennet_rx_mbuf_free: sc->sc_free_rxreql=48 SC_NLIVEREQ(sc)=206
> xennet_rx_mbuf_free: sc->sc_free_rxreql=49 SC_NLIVEREQ(sc)=206
> # mount /
> # /etc/rc.d/network start
> xennet_rx_mbuf_free: sc->sc_free_rxreql=50 SC_NLIVEREQ(sc)=188
> xennet_rx_mbuf_free: sc->sc_free_rxreql=51 SC_NLIVEREQ(sc)=188
> xennet_rx_mbuf_free: sc->sc_free_rxreql=52 SC_NLIVEREQ(sc)=188
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=65 SC_NLIVEREQ(sc)=188
> xennet_rx_mbuf_free: sc->sc_free_rxreql=66 SC_NLIVEREQ(sc)=188
> xennet_rx_mbuf_free: sc->sc_free_rxreql=67 SC_NLIVEREQ(sc)=188
> Starting network.
> Hostname: mail3.panix.com
> IPv6 mode: host
> xennet_rx_mbuf_free: sc->sc_free_rxreql=68 SC_NLIVEREQ(sc)=187
> Configuring network interfaces: xennet0.
> Adding interface aliases:.
> add net default: gateway 166.84.1.65
> xennet_rx_mbuf_free: sc->sc_free_rxreql=69 SC_NLIVEREQ(sc)=186
> xennet_rx_mbuf_free: sc->sc_free_rxreql=70 SC_NLIVEREQ(sc)=185
> xennet_rx_mbuf_free: sc->sc_free_rxreql=71 SC_NLIVEREQ(sc)=181
> ...
> # /etc/rc.d/sshd start
> [ log in remotely and start test with telnet]
> xennet_rx_mbuf_free: sc->sc_free_rxreql=42 SC_NLIVEREQ(sc)=213
> xennet_rx_mbuf_free: sc->sc_free_rxreql=43 SC_NLIVEREQ(sc)=212
> xennet_rx_mbuf_free: sc->sc_free_rxreql=44 SC_NLIVEREQ(sc)=211
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=110 SC_NLIVEREQ(sc)=134
> xennet_rx_mbuf_free: sc->sc_free_rxreql=111 SC_NLIVEREQ(sc)=132
> xennet_rx_mbuf_free: sc->sc_free_rxreql=112 SC_NLIVEREQ(sc)=132
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=124 SC_NLIVEREQ(sc)=131
> xennet_rx_mbuf_free: sc->sc_free_rxreql=125 SC_NLIVEREQ(sc)=130
> xennet_rx_mbuf_free: sc->sc_free_rxreql=126 SC_NLIVEREQ(sc)=128
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=252
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=249
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=248
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=37 SC_NLIVEREQ(sc)=113
> xennet_rx_mbuf_free: sc->sc_free_rxreql=38 SC_NLIVEREQ(sc)=108
> xennet_rx_mbuf_free: sc->sc_free_rxreql=39 SC_NLIVEREQ(sc)=103
> xennet_rx_mbuf_free: sc->sc_free_rxreql=40 SC_NLIVEREQ(sc)=97
> xennet_rx_mbuf_free: sc->sc_free_rxreql=41 SC_NLIVEREQ(sc)=96
> xennet_rx_mbuf_free: sc->sc_free_rxreql=42 SC_NLIVEREQ(sc)=96
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=135
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=130
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=128
> # netstat -f inet -n
> Active Internet connections
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 6800 0 166.84.1.74.65534 166.84.1.3.23 ESTABLISHED
> tcp 0 0 166.84.1.74.22 166.84.1.253.607 ESTABLISHED
> [ Recv-Q starting to fill up ]
> xennet_rx_mbuf_free: sc->sc_free_rxreql=3 SC_NLIVEREQ(sc)=126
> xennet_rx_mbuf_free: sc->sc_free_rxreql=4 SC_NLIVEREQ(sc)=126
> xennet_rx_mbuf_free: sc->sc_free_rxreql=5 SC_NLIVEREQ(sc)=123
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=10 SC_NLIVEREQ(sc)=14
> xennet_rx_mbuf_free: sc->sc_free_rxreql=11 SC_NLIVEREQ(sc)=14
> xennet_rx_mbuf_free: sc->sc_free_rxreql=12 SC_NLIVEREQ(sc)=10
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=20
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=16
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=13
> xennet_rx_mbuf_free: sc->sc_free_rxreql=3 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=4 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=5 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=6 SC_NLIVEREQ(sc)=6
>
> Active Internet connections
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 13220 0 166.84.1.74.65534 166.84.1.3.23 ESTABLISHED
> tcp 0 0 166.84.1.74.22 166.84.1.253.607 ESTABLISHED
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=10
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=6
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=3
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=2
> xennet_handler: copying packet: i=901 free_rxreql=1
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=0
At this point, there is no space left in the ring to receive new packets,
> xennet_handler: copying packet: i=902 free_rxreql=0
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=0
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=1
but there is again one slot in the ring; this should be enough to make
limited progress.
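
To make the accounting behind that remark concrete, here is a toy model in
C. It is not the xennet driver code. The struct, the function names, the
ring size of 256 and the repost rule (repost once the free list is at least
as large as the live set) are assumptions made for illustration; the rule
is only inferred from the pattern of the counters in the log above.

```c
#include <assert.h>

#define RING_SIZE 256   /* assumed ring size, for illustration only */

/* Toy stand-in for the receive-side counters; not the real softc. */
struct rx_model {
    int free_reqs;  /* requests on the free list (cf. sc_free_rxreql) */
    int live_reqs;  /* requests posted to the backend (cf. SC_NLIVEREQ) */
    int in_stack;   /* buffers loaned to the stack, e.g. queued in a socket */
};

/* Start with every request posted to the backend. */
static struct rx_model rx_model_init(void)
{
    struct rx_model m = { 0, RING_SIZE, 0 };
    return m;
}

/* Hand every free request back to the backend. */
static void repost(struct rx_model *m)
{
    m->live_reqs += m->free_reqs;
    m->free_reqs = 0;
}

/*
 * The backend fills one live request and the buffer is loaned to the
 * stack. Returns 0 when no live request is left, i.e. the ring is stalled.
 */
static int receive_loan(struct rx_model *m)
{
    if (m->live_reqs == 0)
        return 0;
    m->live_reqs--;
    m->in_stack++;
    return 1;
}

/*
 * The stack releases one loaned buffer. Its request goes to the free
 * list, which is reposted when it is at least as large as the live set
 * (assumed policy, see above).
 */
static void mbuf_free(struct rx_model *m)
{
    assert(m->in_stack > 0);
    m->in_stack--;
    m->free_reqs++;
    if (m->free_reqs >= m->live_reqs)
        repost(m);
}
```

In this model, calling receive_loan() until it returns 0 leaves every
buffer in the stack with no live request, and a single mbuf_free() then
puts exactly one request back on the ring, so one more packet can be
received. That is the limited progress expected here; a complete stall
suggests something outside this simple accounting.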
> #
> [ XXXXX network has stopped completely XXXXX ]
> #
> #
> #
> #
> #
> #
> #
> #
> # pkill telnet
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=2
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=2
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=4
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=4
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=4
> xennet_rx_mbuf_free: sc->sc_free_rxreql=3 SC_NLIVEREQ(sc)=4
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=2 SC_NLIVEREQ(sc)=8
> xennet_rx_mbuf_free: sc->sc_free_rxreql=3 SC_NLIVEREQ(sc)=8
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=30 SC_NLIVEREQ(sc)=32
> xennet_rx_mbuf_free: sc->sc_free_rxreql=31 SC_NLIVEREQ(sc)=32
> xennet_rx_mbuf_free: sc->sc_free_rxreql=0 SC_NLIVEREQ(sc)=64
> xennet_rx_mbuf_free: sc->sc_free_rxreql=1 SC_NLIVEREQ(sc)=64
> ...
> xennet_rx_mbuf_free: sc->sc_free_rxreql=60 SC_NLIVEREQ(sc)=195
> xennet_rx_mbuf_free: sc->sc_free_rxreql=61 SC_NLIVEREQ(sc)=194
And at this point, the network has not restarted?
When this happens, can you check the flags of the corresponding
xvif interface in the backend?
Can you give details about your dom0?
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference