Re: flexcan missing error state transitions

From: Andri Yngvason <hidden>
Date: 2017-08-28 11:23:33

Quoting 朱燚 (2017-08-26 05:46:36)

quoted

That sounds rather shaky. Wouldn't it be more robust to have the timer be
periodic? Besides, over-sampling is preferable in order to avoid missed
transitions, right? See diff below.

I can understand why you feel shaky, but I think use periodic timer will not
do meaningful work here - just polling state which do not change.

The reason is flexcan_irq_state() will be called upon any interrupt, that
already guarantees state is up-to-date if there is traffic on the bus, which
is the trigger of any possible state transitions. In this case, any kind of
polling is not required (however, this patch will still do once extra polling,
which I'd like to avoid as well, but seems no easy solution). Extra polling
will get the same state as before if there is no more traffic, and if the
traffic is ongoing, the interrupt will be fired.

The only case above approach cannot handle is there is only singled out node
on the bus, where the traffic cannot continue. Although the core will perform
auto-retransmission, but it will not generate flood of error interrupts,
instead only generate an ACK error (/+WARN) interrupt, where the FLT_CONF is
not up-to-date yet when the flexcan_irq_state() been called, and there is no
trigger to update the state again when the register is changed. That's the
point of this patch to compensate this trigger by poll once at the right time.
Extra polling will get the same state too if the bus is still open, and if the
bus is recovered, the interrupt will be fired again.

Regarding missed transitions, yes, if user chose a too short delay, then he
will miss the error passive transition, that's why the delay is configurable
via DT.

quoted

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c

...

quoted

+static void flexcan_update_state_timeout(unsigned long data)
+{
+   struct net_device *dev =3D (struct net_device*)data;
+   struct flexcan_priv *priv =3D netdev_priv(dev);
+   struct flexcan_regs __iomem *regs =3D priv->regs;
+   u32 reg_esr;
+
+   /* update state atomically */
+   disable_irq(dev->irq);
+   reg_esr = flexcan_read(&regs->esr);
+   flexcan_irq_state(dev, reg_esr);

You could restart the timer here if the error state is not error-active. Then
you would not have the problem of having to choose an interval so carefully.

Yes, you're right. But I think it is good to think carefully which delay is
really necessary than just drop a randomly number here. With periodic timer,
if very short delay assigned, then the performance will be hit, if reasonable
delay assigned, then unnecessary polling will be done.

quoted

+   enable_irq(dev->irq);
+}

...

I understand that use periodic timer is a more generic approach which even
possible to replace the interrupt driven on broken core (if want to), but I
want to avoid unnecessary polling as much as possible (and still get the
correct state transitions).

What do you think?

Hi Yi,

I have users who can change the baudrate but they can not change the device
tree. The baudrate can even be changed by service technicians. I.e. it's field
configurable. How would I configure the timeout to work well for rates between
125k and 1M?

Do you think that a periodic timer would load the system more than an IRQ-flood
due to a bad error situation?

Best regards,
Andri

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help