Re: [PATCH] net: ftgmac100: Fix missing TX-poll issue
From: Jakub Kicinski <kuba@kernel.org>
Date: 2020-10-20 02:57:28
Also in:
linux-arch, linux-aspeed, lkml, openbmc
On Tue, 20 Oct 2020 10:23:41 +1100 Benjamin Herrenschmidt wrote:
On Mon, 2020-10-19 at 12:00 -0700, Jakub Kicinski wrote:quoted
On Mon, 19 Oct 2020 08:57:03 +0000 Joel Stanley wrote:quoted
quoted
diff --git a/drivers/net/ethernet/faraday/ftgmac100.cb/drivers/net/ethernet/faraday/ftgmac100.c index 00024dd41147..9a99a87f29f3 100644--- a/drivers/net/ethernet/faraday/ftgmac100.c +++ b/drivers/net/ethernet/faraday/ftgmac100.c@@ -804,7 +804,8 @@ static netdev_tx_tftgmac100_hard_start_xmit(struct sk_buff *skb, * before setting the OWN bit on the first descriptor. */ dma_wmb(); - first->txdes0 = cpu_to_le32(f_ctl_stat); + WRITE_ONCE(first->txdes0, cpu_to_le32(f_ctl_stat)); + READ_ONCE(first->txdes0);I understand what you're trying to do here, but I'm not sure that this is the correct way to go about it. It does cause the compiler to produce a store and then a load.+1 @first is system memory from dma_alloc_coherent(), right? You shouldn't have to do this. Is coherent DMA memory broken on your platform?I suspect the problem is that the HW (and yes this would be a HW bug) doesn't order the CPU -> memory and the CPU -> MMIO path. What I think happens is that the store to txde0 is potentially still in a buffer somewhere on its way to memory, gets bypassed by the store to MMIO, causing the MAC to try to read the descriptor, and getting the "old" data from memory.
I see, but in general this sort of a problem should be resolved by adding an appropriate memory barrier. And in fact such barrier should (these days) be implied by a writel (I'm not 100% clear on why this driver uses iowrite, and if it matters).
It's ... fishy, but that isn't the first time an Aspeed chip has that type of bug (there's a similar one in the USB device controler iirc).
Argh.