Source-Changes-HG archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

[src/trunk]: src/sys/dev/pci/ixgbe Fix another poll mode assumption breaking....



details:   https://anonhg.NetBSD.org/src/rev/9ab6dd701d0d
branches:  trunk
changeset: 360345:9ab6dd701d0d
user:      knakahara <knakahara%NetBSD.org@localhost>
date:      Wed Mar 07 11:18:29 2018 +0000

description:
Fix another poll mode assumption breaking. Implemented by msaitoh@n.o, I just commit by proxy.

ixgbe_rearm_queues() writes EICS register(s). 82599, X540 and X550
specifications say "Following a write of 1b to any bit in the EICS register
(interrupt cause set), its corresponding bit in the EIMS register is auto
set as well enabling its interrupt." in "Extended Interrupt Auto Mask Enable
(EIAM) Register" section. That is, ixgbe_rearm_queues() causes interrupts
regardless of the status managed by ixgbe_enable_queue()/ixgbe_disable_queue().
That can break poll mode assumption.

In fact, the problem occurs in the following situation
    - CPU#A has high load traffic, in contrast, CPU#B has not so high load traffic
    - CPU#A is occurred interrupt by its NIC queue
      - CPU#A calls ixgbe_disable_queue() in interrupt handler(ixgbe_msix_que())
      - CPU#A kick softint handler(ixgbe_handle_que())
        - CPU#A begins softint
        - CPU#A's NIC queue is set que->txr->busy flag
        - With some reason, CPU#A can do ixg interrupt handler
          E.g. when one of CPU#A's softnet handlers sleeps, ipl is lowered
    - CPU#B starts callout
      - CPU#B calls ixgbe_local_timer1()
        - CPU#B writes EICS bit corresponding CPU#A's NIC queue bit
    - CPU#A's NIC queue causes interrupt whie CPU#A is running in poll mode
      - CPU#A calls ixgbe_disable_queue() in interrupt handler *again*
    - CPU#A has done polling, and then CPU#A calls ixgbe_enable_queue() *once*
    - CPU#A's NIC queue interrupt is disabled until ixg is detached as
      ixgbe_disable_queue() is called twice though ixgbe_disable_queue() is
      called once only

NOTE:
82598 does not say so, but it is treated in the same way because of no harm.

By the way, we will refactor ixgbe_local_timer(watchdog processing) later.

XXX pullup-8

diffstat:

 sys/dev/pci/ixgbe/ixgbe.c |  11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diffs (25 lines):

diff -r 850bc0ba1022 -r 9ab6dd701d0d sys/dev/pci/ixgbe/ixgbe.c
--- a/sys/dev/pci/ixgbe/ixgbe.c Wed Mar 07 10:53:55 2018 +0000
+++ b/sys/dev/pci/ixgbe/ixgbe.c Wed Mar 07 11:18:29 2018 +0000
@@ -1,4 +1,4 @@
-/* $NetBSD: ixgbe.c,v 1.130 2018/03/07 08:01:32 msaitoh Exp $ */
+/* $NetBSD: ixgbe.c,v 1.131 2018/03/07 11:18:29 knakahara Exp $ */
 
 /******************************************************************************
 
@@ -4229,7 +4229,14 @@
        if (hung == adapter->num_queues)
                goto watchdog;
        else if (queues != 0) { /* Force an IRQ on queues with work */
-               ixgbe_rearm_queues(adapter, queues);
+               que = adapter->queues;
+               for (int i = 0; i < adapter->num_queues; i++, que++) {
+                       mutex_enter(&que->im_mtx);
+                       if (que->im_nest == 0)
+                               ixgbe_rearm_queues(adapter,
+                                   queues & ((u64)1 << i));
+                       mutex_exit(&que->im_mtx);
+               }
        }
 
 out:



Home | Main Index | Thread Index | Old Index