Re: [RFC PATCH] e1000e: Remove Other from EIAC.
From: Benjamin Poirier <hidden>
Date: 2018-01-19 05:37:09
Also in:
intel-wired-lan, lkml
Subsystem:
intel ethernet drivers, networking drivers, the rest · Maintainers:
Tony Nguyen, Przemek Kitszel, Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds
On 2018/01/18 18:42, Shrikrishna Khare wrote:
On Thu, 18 Jan 2018, Benjamin Poirier wrote:quoted
On 2018/01/18 15:50, Benjamin Poirier wrote:quoted
It was reported that emulated e1000e devices in vmware esxi 6.5 Build 7526125 do not link up after commit 4aea7a5c5e94 ("e1000e: Avoid receiver overrun interrupt bursts", v4.15-rc1). Some tracing shows that after e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other() on emulated e1000e devices. In comparison, on real e1000e 82574 hardware, icr=0x80000004 (_INT_ASSERTED | _OTHER) in the same situation. Some experimentation showed that this flaw in vmware e1000e emulation can be worked around by not setting Other in EIAC. This is how it was before 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).vmware folks, please comment.Thank you for bringing this to our attention. Using the reported build (ESX 6.5, 7526125) and 4.15.0-rc8+ kernel (which has the said patch), I could bring up e1000e interface (version: 3.2.6-k), get dhcp address and even do large file downloads without difficulty. Could you give us more pointers on how we may be able to reproduce this locally? Was there anything different with the configuration when the issue was observed? Is the issue consistently reproducible?
It's consistently reproducible, however I noticed that once in a while
there is a genuine "Other" interrupt that comes in and triggers the link
status change. The problem is with interrupts that are triggered via a
write to ICS (such as in e1000e_trigger_lsc()). Can you reproduce a
problem if you do:
ip link set ethX down
ip link set ethX up
If you're building your own kernel, you can add the following patch and
cat /sys/kernel/debug/tracing/trace_pipe
For me it shows on v4.15-rc8:
<...>-2578 [000] .... 83527.938321: e1000e_trigger_lsc: trigger_lsc
<...>-2578 [000] d.h. 83527.938398: e1000_msix_other: icr 0x0
With the patch that I submitted, it shows:
wickedd-1329 [002] .N.. 20.123545: e1000e_trigger_lsc: trigger_lsc
<idle>-0 [000] d.h. 20.123630: e1000_msix_other: icr 0x81000004
<idle>-0 [000] d.h. 20.123654: e1000_msix_other: lsc
<idle>-0 [000] d.h. 20.123676: e1000_msix_other: mod_timer
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9f18d39bdc8f..16620ce840fc 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c@@ -1918,22 +1918,29 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data) bool enable = true; icr = er32(ICR); + trace_printk("icr 0x%x\n", icr); + if (icr & E1000_ICR_RXO) { + trace_printk("rxo\n"); ew32(ICR, E1000_ICR_RXO); enable = false; /* napi poll will re-enable Other, make sure it runs */ if (napi_schedule_prep(&adapter->napi)) { + trace_printk("napi schedule\n"); adapter->total_rx_bytes = 0; adapter->total_rx_packets = 0; __napi_schedule(&adapter->napi); } } if (icr & E1000_ICR_LSC) { + trace_printk("lsc\n"); ew32(ICR, E1000_ICR_LSC); hw->mac.get_link_status = true; /* guard against interrupt when we're going down */ - if (!test_bit(__E1000_DOWN, &adapter->state)) + if (!test_bit(__E1000_DOWN, &adapter->state)) { + trace_printk("mod_timer\n"); mod_timer(&adapter->watchdog_timer, jiffies + 1); + } } if (enable && !test_bit(__E1000_DOWN, &adapter->state))
@@ -4221,6 +4228,8 @@ static void e1000e_trigger_lsc(struct e1000_adapter *adapter) { struct e1000_hw *hw = &adapter->hw; + trace_printk("trigger_lsc\n"); + if (adapter->msix_entries) ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER); else