Re: m_can: a lot of 'Rx FIFO 0 Message Lost' in dmesg
From: Torin Cooper-Bennun <hidden>
Date: 2021-02-26 13:37:48
On Wed, Feb 24, 2021 at 02:27:28PM +0000, Mariusz Madej wrote:
Hi,
I have a problem with the m_can controller in my sama5d2 processor. Under heavy CAN traffic, my device starts to report (dmesg):
[ 77.610000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.620000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.630000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.630000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.640000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.640000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.650000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.660000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[ 77.660000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
which causes a large load problem in my system.
How heavy is this traffic? Is the bus operating at a very high bitrate? Are there any other useful lines in dmesg?
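For a rough sense of scale, the frame rate the bus can deliver follows from the bitrate and the frame length. The sketch below assumes classical CAN data frames with 11-bit identifiers and up to 8 data bytes; the function names are made up for illustration and are not part of the m_can driver.

```c
/*
 * Bits on the wire for one classical CAN data frame with an 11-bit ID:
 * 44 bits of framing plus 3 bits of interframe space, plus the payload.
 * Stuff bits are ignored here. Assumes dlc <= 8.
 */
static unsigned int can_frame_bits(unsigned int dlc)
{
	return 47 + 8 * dlc;
}

/*
 * Worst case: the stuffed region is 34 + 8 * dlc bits long and can gain
 * one stuff bit for every 4 bits after the first.
 */
static unsigned int can_frame_bits_worst(unsigned int dlc)
{
	return can_frame_bits(dlc) + (34 + 8 * dlc - 1) / 4;
}

/* Upper bound on frames per second with back-to-back frames, no stuffing. */
static unsigned int can_max_frames_per_sec(unsigned int bitrate,
					   unsigned int dlc)
{
	return bitrate / can_frame_bits(dlc);
}
```

At 1 Mbit/s with 8-byte payloads this gives an upper bound of about 9000 frames per second, or one frame roughly every 111 µs.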
The only place in the m_can.c file where the interrupt register is cleared is the
function called when an interrupt arrives:
static irqreturn_t m_can_isr(int irq, void *dev_id)
{
	.
	.
	/* ACK all irqs */
	if (ir & IR_ALL_INT)
		m_can_write(cdev, M_CAN_IR, ir);
	.
	.
}
But when we enter 'NAPI mode' under heavy load, we never get to this function
until the load drops and interrupts are enabled again. In this situation,
this code:

The m_can driver handles the IRQ by offloading the RX to a NAPI queue, so the RX procedure is deferred and scheduled to happen at a (slightly) later time. As far as I understand it, interrupts are not disabled at any point.
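To illustrate why deferred readout can lose frames, here is a toy model of a hardware RX FIFO drained by a budgeted poll. It is not the m_can driver's code: the FIFO depth, the struct, and all function names are assumptions for the sake of the sketch, and the real depth depends on how the message RAM is configured on the platform.

```c
#define FIFO_DEPTH 64	/* assumed depth, for illustration only */

struct rx_fifo {
	unsigned int fill;	/* frames currently stored */
	unsigned int lost;	/* frames dropped because the FIFO was full */
};

/* Hardware side: a frame arrives and is dropped if the FIFO is full. */
static void fifo_frame_arrives(struct rx_fifo *f)
{
	if (f->fill == FIFO_DEPTH)
		f->lost++;
	else
		f->fill++;
}

/* Software side: one deferred poll pass reads out at most `budget` frames. */
static unsigned int fifo_poll(struct rx_fifo *f, unsigned int budget)
{
	unsigned int done = f->fill < budget ? f->fill : budget;

	f->fill -= done;
	return done;
}

/*
 * Run `ticks` rounds. In each round `arrivals` frames come in, then one
 * poll pass drains up to `budget` frames. Returns the number of lost frames.
 */
static unsigned int simulate(unsigned int ticks, unsigned int arrivals,
			     unsigned int budget)
{
	struct rx_fifo f = { 0, 0 };
	unsigned int t, i;

	for (t = 0; t < ticks; t++) {
		for (i = 0; i < arrivals; i++)
			fifo_frame_arrives(&f);
		fifo_poll(&f, budget);
	}
	return f.lost;
}
```

In this model, as long as each poll pass keeps up with arrivals, nothing is lost. Once arrivals exceed the budget, the FIFO fills over a few rounds and every later round drops the excess, which shows up as a steady stream of lost-message events.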
That is why we got so many messages in a row for such a long time. So clearing the RXFS_RFL bit after the warning is issued could be a solution.
RXFS_RFL is a flag in a status register, not an interrupt flag. There is a corresponding interrupt flag, but that is cleared along with the rest, at the top of m_can_isr.

I think you are losing messages because the traffic is too heavy for your system to read out the messages fast enough. That is the usual reason for seeing "Rx FIFO 0 Message Lost".

-- 
Regards,
Torin Cooper-Bennun
Software Engineer | maxiluxsystems.com