Thread (3 messages) 3 messages, 2 authors, 2018-08-20

RE: Kernel locks after consecutive writes returning ENOBUFS over can0

From: Ermans, Brian (Ext.) <hidden>
Date: 2018-08-20 17:09:53

Hello Julia,

Thank you for your reply.
quoted
Changing thread priority or application priority does not change
behavior.
Is this true even if ksoftirqd/1 is prioritized over the CAN thread?
It took us some time, but good call, this actually seems to fix the issue.

When we force ksoftirqd to FIFO and the CAN thread to NORMAL it does not lock! (or at least, not within a reasonable time frame, will do some long-term testing tonight)
We will have to do extensive testing to verify this is a valid work-around for us. We don't know if changing these priorities have unforeseen side-effects at this point, but for now it seems to be good.

root@OpenWrt:~# ps | grep ksoftirq
    3 root         0 SW   [ksoftirqd/0]
   19 root         0 SW   [ksoftirqd/1]
 2615 root      3040 S    grep ksoftirq
root@OpenWrt:~# chrt -f -p 98 3
pid 3's current scheduling policy: SCHED_OTHER
pid 3's current scheduling priority: 0
pid 3's new scheduling policy: SCHED_FIFO
pid 3's new scheduling priority: 98
root@OpenWrt:~# chrt -f -p 98 19
pid 19's current scheduling policy: SCHED_OTHER
pid 19's current scheduling priority: 0
pid 19's new scheduling policy: SCHED_FIFO
pid 19's new scheduling priority: 98


Another thing I mentioned to you in our private conversation, that probably is interesting for anyone reading along:
We have confirmed this issue does not occur when using a mPCIe device from Peak. User space code is the same as we still write to can0.
I failed to mention this before as we didn't test this yet at the time I wrote the original mail.

Please let me know if you need more information.

Thank you,

Brian
quoted
We are experiencing issues with RT Linux (4.14.59-rt37) as we have
found a re-occurring lock in the kernel. This issue has also been
observed in 4.8.6-rt5.

Our setup consists of an Atom dual-core (/proc/cpuinfo below) running
open-wrt in combination with a Peak CAN usb dongle.  This CAN dongle
is interfaced via SocketCAN and written to with high frequency (code
below). This code runs from a dedicated thread. This thread is the
only place we access the CAN interface (can0).
Is this Peak CAN USB dongle driver in the kernel tree somewhere?  Or is
it an out-of-tree driver?
This CAN USB dongle driver is part of mainline kernel:

https://github.com/torvalds/linux/blob/master/drivers/net/can/usb/peak_usb/pcan_usb_core.c
quoted
Sometimes, a write to this can0 interface results in a lock. This only
occurs after a write resulting in ENOBUFS. Our code re-attempts to
write after a ENOBUFS and the time between the first attempt and the
second one influences the rate of this lock occurring. If we
usleep(100) our lock happens very fast (~2seconds), usleep(1000) can
take up to a minute and we haven't observed the issue with a usleep of
3000. Having the usleep as short as possible is critical for our
application. We have only observed locking of the CAN thread on cpu1.
That's concerning.
quoted
This lock results in the core being fully used by the CAN thread and
the systems becoming unresponsive. Sometimes this is detected by the
kernel, after which a call stack is produced (two callstacks below).
Just to check: you're not seeing this CAN thread return back to
usermode, correct?  It remains "stuck" spinning in the write() in kernel
context?
Correct, the thread does not return at all. Pidstat -t shows continuous increasing system usage for the CAN thread and CAN thread alone.
quoted
Please be aware that, due to application limitations, we have our
txqueuelen configured as 1. This results in relative frequent ENOBUFS.

Changing thread priority or application priority does not change
behavior.
Is this true even if ksoftirqd/1 is prioritized over the CAN thread?
quoted
CAN related settings changes like bitrate, CAN2.0 or
CAN-FD, frame size, cable length or busload also do not improve the
situation.
quoted
We have made traces (using trace-cmd record -e 'sched_wakeup*' -e
sched_switch -e 'sched_migrate*' ) during the lock-up and it seems as
if a ksoftirqd is sched_wakeup, but never actually switched to right
before the lock occurs:
This is what I would expect if your CAN thread is spinning at an RT
priority greater that ksoftirqd/1.  It's not clear from your traces what
priority/class your CAN thread is executing as.
quoted
             CAN-2434  [001]   116.127166: sched_wakeup:         ksoftirqd/1:21
[120] success=1 CPU:001
quoted
       RtFdbDisp-2133  [000]   116.127175: sched_switch:
RtFdbDisp:2133 [120] S ==> CFpgaEventColle:2103 [120]
quoted
 CFpgaEventColle-2103  [000]   116.127179: sched_wakeup:
ScmIfm:2112 [22] success=1 CPU:000
quoted
 CFpgaEventColle-2103  [000]   116.127182: sched_switch:
CFpgaEventColle:2103 [120] R ==> ScmIfm:2112 [22]
quoted
          ScmIfm-2112  [000]   116.127190: sched_switch:         ScmIfm:2112
[22] S ==> CFpgaEventColle:2103 [120]
quoted
 CFpgaEventColle-2103  [000]   116.127195: sched_wakeup:
ScmIfm:2112 [22] success=1 CPU:000
quoted
 CFpgaEventColle-2103  [000]   116.127198: sched_switch:
CFpgaEventColle:2103 [120] R ==> ScmIfm:2112 [22]
quoted
          ScmIfm-2112  [000]   116.127207: sched_wakeup:
MsgHndr4:2111 [25] success=1 CPU:000
quoted
          ScmIfm-2112  [000]   116.127214: sched_switch:         ScmIfm:2112
[22] S ==> MsgHndr4:2111 [25]
quoted
        MsgHndr4-2111  [000]   116.127224: sched_wakeup:
RtFdbDisp:2133 [120] success=1 CPU:000
quoted
        MsgHndr4-2111  [000]   116.127245: sched_switch:
MsgHndr4:2111 [25] S ==> RtFdbDisp:2133 [120]
quoted
       RtFdbDisp-2133  [000]   116.127255: sched_switch:
RtFdbDisp:2133 [120] S ==> CFpgaEventColle:2103 [120]
quoted
 CFpgaEventColle-2103  [000]   116.127263: sched_switch:
CFpgaEventColle:2103 [120] S ==> trace-cmd:2432 [120]
quoted
       trace-cmd-2432  [000]   116.127331: sched_switch:         trace-
cmd:2432 [120] S ==> swapper/0:0 [120]
quoted
             CAN-2434  [001]   116.127604: sched_wakeup:
ktimersoftd/1:20 [98] success=1 CPU:001
quoted
             <idle>-0  [000]   116.128849: sched_wakeup:         ktimersoftd/0:8
[98] success=1 CPU:000
quoted
             <idle>-0  [000]   116.128863: sched_switch:         swapper/0:0
[120] R ==> ktimersoftd/0:8 [98]
quoted
      ktimersoftd/0-8  [000]   116.128897: sched_switch:
ktimersoftd/0:8 [98] S ==> swapper/0:0 [120]
quoted
             <idle>-0  [000]   116.130147: sched_wakeup:         ktimersoftd/0:8
[98] success=1 CPU:000
quoted
             <idle>-0  [000]   116.130157: sched_switch:         swapper/0:0
[120] R ==> ktimersoftd/0:8 [98]
quoted
      ktimersoftd/0-8  [000]   116.130180: sched_wakeup:
Net_RJ45:2150 [120] success=1 CPU:000
quoted
and so on. From this point on our CAN thread "locks". It does however
perform some scheduling. The lines below are the only occurrences
within this particular trace where "CAN-2434" is mentioned after the
lock(note the difference in timestamps).

       CAN-2434  [001]   116.127604: sched_wakeup:
ktimersoftd/1:20 [98] success=1 CPU:001
quoted
       CAN-2434  [001]   116.145468: sched_migrate_task:        comm=tQLive
pid=2148 prio=23 orig_cpu=1 dest_cpu=0
quoted
       CAN-2434  [001]   116.145472: sched_wakeup:              tQLive:2148
[23] success=1 CPU:000
quoted
       CAN-2434  [001]   116.186470: sched_migrate_task:
comm=tQCmd pid=2147 prio=23 orig_cpu=1 dest_cpu=0
quoted
       CAN-2434  [001]   116.186475: sched_wakeup:              tQCmd:2147
[23] success=1 CPU:000
quoted
       CAN-2434  [001]   116.228357: sched_migrate_task:
comm=SockSend pid=2347 prio=33 orig_cpu=1 dest_cpu=0
quoted
       CAN-2434  [001]   116.418810: sched_wakeup:              rcuc/1:19 [120]
success=1 CPU:001

If the CAN thread is stuck spinning on CPU1 and we wakeup an unpinned,
lower priority RT task on CPU1 (making this CPU "overloaded"), the
scheduler will push tasks around in the wakeup path, like we're seeing
here.  In other words, nothing about this seems overtly wrong, just a
symptom of the CAN thread spinning at a high priority.
quoted
The next mention of ksoftirqd in this particular trace is here, way later, in a
new sched_wakeup. Here the sched_switch is actually made:
quoted
        <idle>-0 [000]   116.168849: sched_wakeup:         ksoftirqd/0:7 [120]
success=1 CPU:000
quoted
        <idle>-0 [000]   116.168858: sched_wakeup:         ktimersoftd/0:8 [98]
success=1 CPU:000
quoted
        <idle>-0 [000]   116.168869: sched_switch:         swapper/0:0 [120] R
==> ktimersoftd/0:8 [98]
quoted
 ktimersoftd/0-8 [000]   116.168895: sched_switch:         ktimersoftd/0:8
[98] S ==> ksoftirqd/0:7 [120]
quoted
   ksoftirqd/0-7 [000]   116.168905: sched_switch:         ksoftirqd/0:7 [120]
S ==> swapper/0:0 [120]

This is all on CPU0, though; CPU1 is probably still spinning away. :\
quoted
I can send this full trace and others with this same issue on request.
Having a complete trace is probably going to be more helpful.  It might
be helpful also to also trace more than just the 'sched' class of trace
events.  Namely, also include the 'irq' family of events, 'timer' and
'net' would probably be helpful as well, given the issue you're seeing
appears to be in the packet tx stack (more below).
I will send you a separate mail with traces, including ones made with more trace events.
...
HBM Netherlands B.V., Schutweg 15a, NL-5145 NP Waalwijk, The Netherlands | www.hbm.com

Registered as B.V. (Dutch limited liability company) in the Dutch commercial register 08183075 0000
Company domiciled in Waalwijk | Managing Directors : Ben Keetman, Peter Ackermans

The information in this email is confidential. It is intended solely for the addressee. If you are not the intended recipient, please let me know and delete this email.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help