Re: [RFC PATCH] e1000e: Remove Other from EIAC.

From: Benjamin Poirier <hidden>
Date: 2018-01-19 05:37:09
Also in: intel-wired-lan, lkml
Subsystem: intel ethernet drivers, networking drivers, the rest · Maintainers: Tony Nguyen, Przemek Kitszel, Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

On 2018/01/18 18:42, Shrikrishna Khare wrote:


On Thu, 18 Jan 2018, Benjamin Poirier wrote:

quoted

On 2018/01/18 15:50, Benjamin Poirier wrote:

quoted

It was reported that emulated e1000e devices in vmware esxi 6.5 Build
7526125 do not link up after commit 4aea7a5c5e94 ("e1000e: Avoid receiver
overrun interrupt bursts", v4.15-rc1). Some tracing shows that after
e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
on emulated e1000e devices. In comparison, on real e1000e 82574 hardware,
icr=0x80000004 (_INT_ASSERTED | _OTHER) in the same situation.

Some experimentation showed that this flaw in vmware e1000e emulation can
be worked around by not setting Other in EIAC. This is how it was before
16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).

vmware folks, please comment.

Thank you for bringing this to our attention.

Using the reported build (ESX 6.5, 7526125) and 4.15.0-rc8+ kernel (which 
has the said patch), I could bring up e1000e interface (version: 3.2.6-k),
get dhcp address and even do large file downloads without difficulty.

Could you give us more pointers on how we may be able to reproduce this 
locally? Was there anything different with the configuration when the 
issue was observed? Is the issue consistently reproducible?

It's consistently reproducible, however I noticed that once in a while
there is a genuine "Other" interrupt that comes in and triggers the link
status change. The problem is with interrupts that are triggered via a
write to ICS (such as in e1000e_trigger_lsc()). Can you reproduce a
problem if you do:
	ip link set ethX down
	ip link set ethX up

If you're building your own kernel, you can add the following patch and
cat /sys/kernel/debug/tracing/trace_pipe

For me it shows on v4.15-rc8:
           <...>-2578  [000] .... 83527.938321: e1000e_trigger_lsc: trigger_lsc
           <...>-2578  [000] d.h. 83527.938398: e1000_msix_other: icr 0x0

With the patch that I submitted, it shows:
         wickedd-1329  [002] .N..    20.123545: e1000e_trigger_lsc: trigger_lsc
          <idle>-0     [000] d.h.    20.123630: e1000_msix_other: icr 0x81000004
          <idle>-0     [000] d.h.    20.123654: e1000_msix_other: lsc
          <idle>-0     [000] d.h.    20.123676: e1000_msix_other: mod_timer

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9f18d39bdc8f..16620ce840fc 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c

@@ -1918,22 +1918,29 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
 	bool enable = true;
 
 	icr = er32(ICR);
+	trace_printk("icr 0x%x\n", icr);
+
 	if (icr & E1000_ICR_RXO) {
+		trace_printk("rxo\n");
 		ew32(ICR, E1000_ICR_RXO);
 		enable = false;
 		/* napi poll will re-enable Other, make sure it runs */
 		if (napi_schedule_prep(&adapter->napi)) {
+			trace_printk("napi schedule\n");
 			adapter->total_rx_bytes = 0;
 			adapter->total_rx_packets = 0;
 			__napi_schedule(&adapter->napi);
 		}
 	}
 	if (icr & E1000_ICR_LSC) {
+		trace_printk("lsc\n");
 		ew32(ICR, E1000_ICR_LSC);
 		hw->mac.get_link_status = true;
 		/* guard against interrupt when we're going down */
-		if (!test_bit(__E1000_DOWN, &adapter->state))
+		if (!test_bit(__E1000_DOWN, &adapter->state)) {
+			trace_printk("mod_timer\n");
 			mod_timer(&adapter->watchdog_timer, jiffies + 1);
+		}
 	}
 
 	if (enable && !test_bit(__E1000_DOWN, &adapter->state))

@@ -4221,6 +4228,8 @@ static void e1000e_trigger_lsc(struct e1000_adapter *adapter)
 {
 	struct e1000_hw *hw = &adapter->hw;
 
+	trace_printk("trigger_lsc\n");
+
 	if (adapter->msix_entries)
 		ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER);
 	else

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help