Re: [PATCH v4 net] net: solve a NAPI race
From: David Miller <davem@davemloft.net>
Date: 2017-03-01 17:53:09
From: Eric Dumazet <redacted>
Date: Tue, 28 Feb 2017 10:34:50 -0800
From: Eric Dumazet <edumazet@google.com>

While playing with mlx4 hardware timestamping of RX packets, I found
that some packets were received by TCP stack with a ~200 ms delay...

Since the timestamp was provided by the NIC, and my probe was added
in tcp_v4_rcv() while in BH handler, I was confident it was not
a sender issue, or a drop in the network.

This would happen with a very low probability, but hurting RPC
workloads.

A NAPI driver normally arms the IRQ after the napi_complete_done(),
after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
it.

Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
while IRQ are not disabled, we might have later an IRQ firing and
finding this bit set, right before napi_complete_done() clears it.

This can happen with busy polling users, or if gro_flush_timeout is
used. But some other uses of napi_schedule() in drivers can cause this
as well.
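
For readers following the thread, the ordering described above can be sketched as a small single-threaded userspace model. This is only an illustration, not kernel code and not the patch itself: every name below (toy_napi, toy_schedule_prep(), toy_packet_arrives() and so on) is hypothetical. It also makes two simplifying assumptions: the IRQ is one-shot and stays disarmed until a driver poll re-arms it, and the non-driver path that grabbed the bit does not re-arm the IRQ when it completes.

```c
#include <stdbool.h>

/* Toy userspace model of the race; every name here is hypothetical. */
struct toy_napi {
    bool sched;      /* stands in for NAPI_STATE_SCHED */
    bool irq_armed;  /* device may raise an IRQ for new packets */
    int  rx_pending; /* packets sitting in the RX ring */
};

/* Try to take ownership of polling; fails if the bit is already set. */
static bool toy_schedule_prep(struct toy_napi *n)
{
    if (n->sched)
        return false;
    n->sched = true;
    return true;
}

/* Drain the ring and return how many packets were processed. */
static int toy_poll(struct toy_napi *n)
{
    int done = n->rx_pending;

    n->rx_pending = 0;
    return done;
}

/*
 * A packet lands in the ring. If the IRQ is armed, the hard IRQ handler
 * runs: the IRQ becomes disarmed (assumption: one-shot) and the handler
 * tries to grab the bit. Returns true if a poll was scheduled.
 */
static bool toy_packet_arrives(struct toy_napi *n)
{
    n->rx_pending++;
    if (!n->irq_armed)
        return false;
    n->irq_armed = false;
    return toy_schedule_prep(n);
}

/* Normal order: complete, re-arm, then the IRQ handler grabs the bit. */
static int toy_normal_sequence(void)
{
    struct toy_napi n = { .sched = true, .irq_armed = false, .rx_pending = 0 };

    toy_poll(&n);
    n.sched = false;               /* completion clears the bit */
    n.irq_armed = true;            /* driver re-arms the IRQ afterwards */
    if (toy_packet_arrives(&n)) {  /* handler finds the bit clear */
        toy_poll(&n);
        n.sched = false;
        n.irq_armed = true;
    }
    return n.rx_pending;           /* packets left stranded in the ring */
}

/* Racy order: someone else holds the bit while the IRQ is still armed. */
static int toy_race_sequence(void)
{
    struct toy_napi n = { .sched = false, .irq_armed = true, .rx_pending = 0 };

    toy_schedule_prep(&n);         /* e.g. a busy poller grabs the bit */
    toy_poll(&n);                  /* ring is empty at this point */
    toy_packet_arrives(&n);        /* IRQ fires, finds the bit set, no poll */
    n.sched = false;               /* completion clears the bit right after */
    return n.rx_pending;           /* nobody is scheduled, IRQ is disarmed */
}
```

In this model toy_normal_sequence() leaves nothing in the ring, while toy_race_sequence() leaves one packet waiting with the bit clear and the IRQ disarmed, so it sits there until some unrelated event triggers another poll.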
...
Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.