Current-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))





  The same diff is at:

     http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180530-0.dif


Updated patch (Fix compile error and ixv patch):

------------------------------------------
 Don't call ixgbe_rearm_queues() in ixgbe_local_timer1(). ixgbe_enable_queue()
and ixgbe_disable_queue() try to enable/disable queue interrupt safely. It
has the internal counter. When a queue's MSI-X is received, ixgbe_msix_que()
is called (IPL_NET). This function disable the queue's interrupt by
ixgbe_disable_queue() and issues an softint. ixgbe_handle() queue is called by
the softint (IPL_SOFTNET), process TX,RX and call ixgbe_enable_queue() at the
end.

 ixgbe_local_timer1() is a callout and run always on CPU 0 (IPL_SOFTCLOCK).
When ixgbe_rearm_queues() called, an MSI-X interrupt is issued for a specific
queue. It may not CPU 0. If this interrupt's ixgbe_msix_que() is called
and sofint_schedule() is called before the last sofint's softint_execute()
is not called, the softint_schedule() fails because of SOFTINT_PENDING.
It result in breaking ixgbe_{enable,disable}_queue()'s internal counter.

 ixgbe_local_timer1() is written not to call ixgbe_rearm_queues() if
the interrupt is disabled, but it's called because of unknown bug or a race.

 One solution is to not to use the internal counter, but it's little difficult.
Another solution is stop using ixgbe_rearm_queues() at all.  Essentially,
ixgbe_rearm_queues() is not required (it was added in ixgbe.c rev. 1.43
(2016/12/01)). ixgbe_rearm_queues() helps for lost interrupt problem but
I've never seen it other than ixgbe_rearm_queues() problem.


Index: ixgbe.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.158
diff -u -p -r1.158 ixgbe.c
--- ixgbe.c	30 May 2018 09:17:17 -0000	1.158
+++ ixgbe.c	1 Jun 2018 03:22:05 -0000
@@ -4411,6 +4411,7 @@ ixgbe_local_timer1(void *arg)
 	/* Only truely watchdog if all queues show hung */
 	if (hung == adapter->num_queues)
 		goto watchdog;
+#if 0 /* XXX Avoid unexpectedly disabling interrupt forever (PR#53294) */
 	else if (queues != 0) { /* Force an IRQ on queues with work */
 		que = adapter->queues;
 		for (i = 0; i < adapter->num_queues; i++, que++) {
@@ -4421,6 +4422,7 @@ ixgbe_local_timer1(void *arg)
 			mutex_exit(&que->dc_mtx);
 		}
 	}
+#endif
out:
 	callout_reset(&adapter->timer, hz, ixgbe_local_timer, adapter);
@@ -6643,7 +6645,7 @@ ixgbe_handle_link(void *context)
 /************************************************************************
  * ixgbe_rearm_queues
  ************************************************************************/
-static void
+static __inline void
 ixgbe_rearm_queues(struct adapter *adapter, u64 queues)
 {
 	u32 mask;
Index: ixv.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixv.c,v
retrieving revision 1.102
diff -u -p -r1.102 ixv.c
--- ixv.c	30 May 2018 08:35:26 -0000	1.102
+++ ixv.c	1 Jun 2018 03:22:05 -0000
@@ -1266,9 +1266,11 @@ ixv_local_timer_locked(void *arg)
 	/* Only truly watchdog if all queues show hung */
 	if (hung == adapter->num_queues)
 		goto watchdog;
+#if 0
 	else if (queues != 0) { /* Force an IRQ on queues with work */
 		ixv_rearm_queues(adapter, queues);
 	}
+#endif
callout_reset(&adapter->timer, hz, ixv_local_timer, adapter); ------------------------------------------

The same diff is at:

	http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180531-0.dif

--
-----------------------------------------------
                SAITOH Masanobu (msaitoh%execsw.org@localhost
                                 msaitoh%netbsd.org@localhost)


Home | Main Index | Thread Index | Old Index