Thread (5 messages), 2 authors, 2023-10-27

Re: [bug] dpaa2-eth: "Wrong SWA type" and null deref in dpaa2_eth_free_tx_fd()

From: Ioana Ciornei <ioana.ciornei@nxp.com>
Date: 2023-10-04 15:50:59

On Wed, Aug 30, 2023 at 07:10:05PM +0200, Daniel Klauer wrote:
> Hi,

Hi Daniel,

> while doing Ethernet tests with raw packet sockets on our custom
> LX2160A board with Linux v6.1.50 (plus some patches for board support,
> but none for dpaa2-eth), I noticed the following crash:

Did you happen to test with any newer kernel?

> [   26.290737] Wrong SWA type
> [   26.290760] WARNING: CPU: 7 PID: 0 at drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1117 dpaa2_eth_free_tx_fd.isra.0+0x36c/0x380 [fsl_dpaa2_eth]
>
> followed by
>
> [   26.323016] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
> [   26.324122] Mem abort info:
> [   26.324475]   ESR = 0x0000000096000004
> [   26.324948]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   26.325618]   SET = 0, FnV = 0
> [   26.326004]   EA = 0, S1PTW = 0
> [   26.326406]   FSC = 0x04: level 0 translation fault
> [   26.327021] Data abort info:
> [   26.327385]   ISV = 0, ISS = 0x00000004
> [   26.327869]   CM = 0, WnR = 0
> [   26.328244] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020861cf000
> [   26.329055] [0000000000000028] pgd=0000000000000000, p4d=0000000000000000
> [   26.329912] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [   26.330702] Modules linked in: tag_dsa marvell mv88e6xxx aes_ce_blk caam_jr aes_ce_cipher caamhash_desc crct10dif_ce ghash_ce fsl_dpaa2_eth caamalg_desc xhci_plat_hcd sha256_generic gf128mul libsha256 libaes xhci_hcd crypto_engine pcs_lynx sha2_ce sha1_ce usbcore libdes sha256_arm64 cfg80211 dp83867 sha1_generic fsl_mc_dpio xgmac_mdio dpaa2_console dwc3 ahci ahci_qoriq udc_core caam libahci_platform roles error libahci usb_common libata at24 lm90 qoriq_thermal nvmem_layerscape_sfp sfp mdio_i2c
> [   26.336237] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W          6.1.50-00121-g10168a070f4d #11
> [   26.337396] Hardware name: mpxlx2160a (DT)
> [   26.337956] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   26.338833] pc : dpaa2_eth_free_tx_fd.isra.0+0xd4/0x380 [fsl_dpaa2_eth]
> [   26.339673] lr : dpaa2_eth_free_tx_fd.isra.0+0xb4/0x380 [fsl_dpaa2_eth]
> [   26.340512] sp : ffff800008cf3d70
> [   26.340931] x29: ffff800008cf3d70 x28: ffff002002900000 x27: 0000000000000000
> [   26.341832] x26: 0000000000000001 x25: 0000000000000001 x24: 0000000000000000
> [   26.342732] x23: 0000000000002328 x22: ffff002009742728 x21: 00000020884fffc2
> [   26.343633] x20: ffff002009740840 x19: ffff0020084fffc2 x18: 0000000000000018
> [   26.344534] x17: ffff8026b3a9a000 x16: ffff800008cf0000 x15: fffffffffffed3f8
> [   26.345435] x14: 0000000000000000 x13: ffff800008bad028 x12: 0000000000000966
> [   26.346335] x11: 0000000000000322 x10: ffff800008c09b58 x9 : ffff800008bad028
> [   26.347236] x8 : 0001000000000000 x7 : ffff0020095e6480 x6 : 00000020884fffc2
> [   26.348137] x5 : ffff0020095e6480 x4 : 0000000000000000 x3 : 0000000000000000
> [   26.349037] x2 : 00000000e7e00000 x1 : 0000000000000001 x0 : 0000000049759e0c
> [   26.349938] Call trace:
> [   26.350247]  dpaa2_eth_free_tx_fd.isra.0+0xd4/0x380 [fsl_dpaa2_eth]
> [   26.351044]  dpaa2_eth_tx_conf+0x84/0xc0 [fsl_dpaa2_eth]
> [   26.351720]  dpaa2_eth_poll+0xec/0x3a4 [fsl_dpaa2_eth]
> [   26.352375]  __napi_poll+0x34/0x180
> [   26.352816]  net_rx_action+0x128/0x2b4
> [   26.353290]  _stext+0x124/0x2a0
> [   26.353687]  ____do_softirq+0xc/0x14
> [   26.354139]  call_on_irq_stack+0x24/0x40
> [   26.354635]  do_softirq_own_stack+0x18/0x2c
> [   26.355164]  __irq_exit_rcu+0xc4/0xf0
> [   26.355628]  irq_exit_rcu+0xc/0x14
> [   26.356059]  el1_interrupt+0x34/0x60
> [   26.356511]  el1h_64_irq_handler+0x14/0x20
> [   26.357028]  el1h_64_irq+0x64/0x68
> [   26.357458]  cpuidle_enter_state+0x12c/0x314
> [   26.357997]  cpuidle_enter+0x34/0x4c
> [   26.358450]  do_idle+0x208/0x270
> [   26.358860]  cpu_startup_entry+0x24/0x30
> [   26.359356]  secondary_start_kernel+0x128/0x14c
> [   26.359928]  __secondary_switched+0x64/0x68
> [   26.360460] Code: 7100081f 54000d00 71000c1f 540000c0 (3940a360)
> [   26.361228] ---[ end trace 0000000000000000 ]---
>
> It happens when receiving big Ethernet frames on an AF_PACKET +
> SOCK_RAW socket, for example MTU 9000. It does not happen with the
> standard MTU 1500. It does not happen when just sending.

Are the transmitted frames also big?

> It's 100% reproducible here, however it seems to depend on the data
> rate/load: Once it happened after receiving the first 80 frames,
> another time after the first 300 frames, etc., and if I only send 5
> frames per second, it does not happen at all.
>
> Please let me know if I should provide more info or do more tests. I
> can provide a test program if needed.

If you can provide a test program, that would be great. It would help in
reproducing and debugging the issue on my side.

Ioana