Thread (7 messages), 3 authors, 2025-09-29

Re: Possible race condition of the rockchip_canfd driver

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: 2025-09-22 08:50:36
Also in: linux-arm-kernel, linux-can, linux-rockchip, lkml

On 20.09.2025 18:08:03, Andrea Daoud wrote:
quoted
On 18.09.2025 20:58:33, Andrea Daoud wrote:
quoted
I'm using the rockchip_canfd driver on an RK3568. Under high bus load, I get
the following logs [1] in rkcanfd_tx_tail_is_eff, and the CAN bus is unable to
communicate properly in this condition. The exact cause is currently not
entirely clear, and it's not reliably reproducible.
Our customer is using a v3 silicon revision of the chip, which doesn't
need this workaround.
Could you please let me know how to check whether my RK3568 is v2 or v3?
Alexander Shiyan (Cc'ed) reads the information from an nvmem cell:

| https://github.com/MacroGroup/barebox/blob/macro/arch/arm/boards/diasom-rk3568/board.c#L239-L257

The idea is to fixup the device tree in the bootloader depending on the
SoC revision, so that the CAN driver uses only the needed workarounds.
quoted
quoted
In the logs we can spot some strange points:

1. Line 24, tx_head == tx_tail. This should have been rejected by the
if (!rkcanfd_get_tx_pending) clause.

2. Line 26, the last bit of priv->tx_tail (0x0185dbb3) is 1. This means that the
tx_tail should be 1, because rkcanfd_get_tx_tail essentially takes
priv->tx_tail modulo two. But the printed tx_tail is 0.
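
The arithmetic behind both points can be checked with a small user-space
sketch. This is not the driver's code: the names tx_index, tx_pending and
TXFIFO_DEPTH are hypothetical, and the FIFO depth of 2 is assumed from the
"modulo two" description above.

```c
#include <stdint.h>

/* Assumed FIFO depth of 2, matching the "modulo two" described above. */
#define TXFIFO_DEPTH 2u

/* Slot index derived from a free-running counter (hypothetical helper). */
static inline unsigned int tx_index(uint32_t counter)
{
	return counter & (TXFIFO_DEPTH - 1u);
}

/*
 * Number of frames queued but not yet completed (hypothetical helper).
 * Unsigned subtraction stays correct when the counters wrap around.
 */
static inline uint32_t tx_pending(uint32_t head, uint32_t tail)
{
	return head - tail;
}
```

With these definitions, tx_index(0x0185dbb3) evaluates to 1, and
tx_pending() is 0 whenever head == tail, which is why both log lines look
inconsistent if all values were read at the same instant.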

I believe these problems could mean that the code is suffering from some race
condition. It seems that, in the whole IRQ processing chain of the driver,
there's no lock protection. Maybe some IRQ happens within the execution of
rkcanfd_tx_tail_is_eff, and touches the state of the tx_head and tx_tail?
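
One way to narrow this down while debugging, sketched here under
assumptions: read each counter exactly once and derive every printed value
from that single copy, so that the log line is at least self-consistent.
The struct and function names below are hypothetical and not taken from the
driver. The sketch does not make the two reads atomic with respect to each
other; it only rules out re-reading a counter that changed in between.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two free-running counters. */
struct tx_state {
	uint32_t head;
	uint32_t tail;
};

/* Values derived from one read of each counter. */
struct tx_snapshot {
	uint32_t head;
	uint32_t tail;
	uint32_t pending;      /* head - tail, wrap-safe */
	unsigned int tail_idx; /* tail modulo an assumed depth of 2 */
};

static struct tx_snapshot tx_take_snapshot(const volatile struct tx_state *s)
{
	struct tx_snapshot snap;

	/* Each counter is loaded once; everything below uses the copies. */
	snap.head = s->head;
	snap.tail = s->tail;
	snap.pending = snap.head - snap.tail;
	snap.tail_idx = snap.tail & 1u;

	return snap;
}
```

If the inconsistency disappears from logs built this way, that would point
towards the counters changing between separate reads rather than towards a
logic error in the index calculation.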

Could you please have a look at the code, and check if some locking is needed?
My time for community support is currently a bit limited. I think this
has to wait a bit, apologies :/
No worries, I will debug it myself, and hopefully send a PR if I find
something.
Great, I have both a v2 and a v3 SoC here to test.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |
