Thread (20 messages) 20 messages, 4 authors, 2026-01-17

Re: [PATCH net v2] net: stmmac: fix transmit queue timed out after resume

From: "Russell King (Oracle)" <linux@armlinux.org.uk>
Date: 2026-01-16 00:50:53
Also in: linux-doc, lkml

On Thu, Jan 15, 2026 at 10:35:26PM +0100, Maxime Chevallier wrote:
Hi again,

On 15/01/2026 22:04, Maxime Chevallier wrote:
quoted
Hi,
quoted
I've just run iperf3 in both directions with the kernel I had on the
board (based on 6.18.0-rc7-net-next+), and stmmac really isn't looking
particularly great - by that I mean, iperf3 *failed* spectacularly.

First, running in normal mode (stmmac transmitting, x86 receiving)
it's only capable of 210Mbps, which is nowhere near line rate.

However, when running iperf3 in reverse mode, it filled the stmmac's
receive queue, which then started spewing PAUSE frames at a rate of
knots, flooding the network, and causing the entire network to stop.
It never recovered without rebooting.
 [...]
quoted
Heh, I was able to reproduce something similar on imx8mp, that has an
imx-dwmac (dwmac 4/5 according to dmesg) :

DUT to x86

Connecting to host 192.168.2.1, port 5201
[  5] local 192.168.2.13 port 54744 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  0.00 Bytes  0.00 bits/sec    2   1.41 KBytes
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes

x86 to DUT :

Reverse mode, remote host 192.168.2.1 is sending
[  5] local 192.168.2.13 port 47050 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   112 MBytes   935 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   936 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   936 Mbits/sec

Nothing as bas as what you face, but there's defintely something going
on there. "good" news is that it worked in v6.19-rc1, I have a bisect
ongoing.

I'll update once I have homed-in on something.

Maxime
So the bisect results are in, at least for the problem I noticed. It's
not certain yet this is the same problem as Russell, and maybe not the
same as Tao Wang as well...

The culprit commit is :

commit 8409495bf6c907a5bc9632464dbdd8fb619f9ceb (HEAD)
Author: Russell King (Oracle) [off-list ref]
Date:   Thu Jan 8 17:36:40 2026 +0000

    net: stmmac: cores: remove many xxx_SHIFT definitions
    
    We have many xxx_SHIFT definitions along side their corresponding
    xxx_MASK definitions for the various cores. Manually using the
    shift and mask can be error prone, as shown with the dwmac4 RXFSTS
    fix patch.
    
    Convert sites that use xxx_SHIFT and xxx_MASK directly to use
    FIELD_GET(), FIELD_PREP(), and u32_replace_bits() as appropriate.
    
    Signed-off-by: Russell King (Oracle) [off-list ref]
    Link: https://patch.msgid.link/E1vdtw8-00000002Gtu-0Hyu@rmk-PC.armlinux.org.uk
    Signed-off-by: Jakub Kicinski [off-list ref]

Lore link :

https://lore.kernel.org/netdev/E1vdtw8-00000002Gtu-0Hyu@rmk-PC.armlinux.org.uk/ (local)

I confirm that iperf3 works perfectly in both directions before this commit,
and I get 0 bits/s when running "iperf3 -c my_host" on the DUT that has stmmac.

Looks like something happened while cleaning-up the macros for the various
definitions.
Thanks for finding the blame.

A few other interesting things... I have an old 6.14 kernel on the
platform, and that gives what I deem to be good transmit performance.
Receive performance is low, but it doesn't fail.

I wrote a shell script to use devmem2 to dump all the stmmac registers.
These seem more significant on the face of it... but I'm working it out
as I write this email:

-Value at address 0x02490010: 0x00010008
+Value at address 0x02490010: 0x00080008
-Value at address 0x02490014: 0x20020008
+Value at address 0x02490014: 0x20000008
-Value at address 0x02490018: 0x00000001
+Value at address 0x02490018: 0x04000001

These are GMAC_HASH_TAB()

-Value at address 0x02490060: 0x001a0000
+Value at address 0x02490060: 0x00120000

VLAN_ONCL, bit is VLAN_CSVL, changed in commit:
c657f86106c8 net: stmmac: vlan: Disable 802.1AD tag insertion offload.

-Value at address 0x024900c0: 0x01000000
+Value at address 0x024900c0: 0x05000000

GMAC_PMT - bit 26, part of the RWKPTR[4:0] bitfield, read-only.

-Value at address 0x02490d30: 0x0ff1c4a0
+Value at address 0x02490d30: 0x0ff1c4e0

MTL_CHAN_RX_OP_MODE(0) - bit 6 is different, MTL_OP_MODE_DIS_TCP_EF.

This is a change from:
fe4042797651 net: stmmac: dwmac4: stop hardware from dropping checksum-error packets

-Value at address 0x02491104: 0x00101011
+Value at address 0x02491104: 0x00001011

DMA_CHAN_TX_CONTROL(0) - but this is significant.
In dwmac4_dma_init_tx_chan(), we have:

-       value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
+       value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);

and the corresponding change in the header file:

 /* DMA SYS Bus Mode bitmap */
 #define DMA_BUS_MODE_SPH               BIT(24)
 #define DMA_BUS_MODE_PBL               BIT(16)
-#define DMA_BUS_MODE_PBL_SHIFT         16
-#define DMA_BUS_MODE_RPBL_SHIFT                16
+#define DMA_BUS_MODE_RPBL_MASK         GENMASK(21, 16)
 #define DMA_BUS_MODE_MB                        BIT(14)
 #define DMA_BUS_MODE_FB                        BIT(0)

The combination of DMA_BUS_MODE_PBL and DMA_BUS_MODE_PBL_SHIFT leads
one to believe that this is a single bit field, whereas there
is another overlapping field called RPBL that is wider. RPBL gets
used for DMA_CHAN_RX_CONTROL, whereas PBL gets used for
DMA_CHAN_TX_CONTROL.

txpbl for the Jetson Xavier NX board (tegra194) is 16:

arch/arm64/boot/dts/nvidia/tegra194.dtsi:                       snps,txpbl = <16>;

which is txpbl. 16 doesn't fit into a single bit. The header file
was wrong.

According to non-Tegra documentation (the closest I have for
dwmac4 is stm32mp151), this field is called TXPBL[5:0] covering
bits 21:16 of this register, and is the transmit burst length.

However, while this may explain the transmit slowdown because it's
on the transmit side, it doesn't explain the receive problem.

-Value at address 0x0249113c: 0x000d07c0
+Value at address 0x0249113c: 0x000507c0

DMA_CHAN_SLOT_CTRL_STATUS(0) - bit 19 RSN[3:0] bit 3, readonly.

With the TXPBL thing fixed, for transmit I now get:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1003 MBytes   841 Mbits/sec    0             sender
[  5]   0.00-10.01  sec  1002 MBytes   839 Mbits/sec                  receiver

which is way better, but receive still fails, with a storm of
PAUSE, with RBU set.

Transmit fix (eventually):
https://lore.kernel.org/r/E1vgY1k-00000003vOC-0Z1H@rmk-PC.armlinux.org.uk (local)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help