Thread (17 messages), 6 authors, 2021-07-26

Re: Random reboots on ODROID-N2+

From: Stefan Agner <stefan@agner.ch>
Date: 2021-05-18 10:23:02
Also in: linux-amlogic

On 2021-05-18 03:33, Andrew Lunn wrote:
> On Mon, May 17, 2021 at 11:14:18AM +0200, Stefan Agner wrote:
>> Hi,
>>
>> We are currently testing a new release using Linux 5.10.33. I've
>> received since several reports of random reboots every couple of days.
>> Unfortunately the log (journald) doesn't show anything, just a hard cut
>> at some point.
>>
>> After running serial console on several instances, I was able to catch
>> this stack trace:
>>
>> [202983.988153] SError Interrupt on CPU3, code 0xbf000000 -- SError
>> [202983.988155] CPU: 3 PID: 3463 Comm: mdns-repeater Not tainted 5.10.33 #1
>> [202983.988156] Hardware name: Hardkernel ODROID-N2Plus (DT)
>> [202983.988157] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> [202983.988158] pc : udp_send_skb.isra.0+0x178/0x390
>> [202983.988159] lr : udp_send_skb.isra.0+0x130/0x390
>
> Hi Stefan

Hi Andrew,

> Could you generate net/ipv4/udp.lst so we can see what
> udp_send_skb.isra.0+0x178/0x390 is trying to do, and what bit of C
> code it maps to.

OK, I built net/ipv4/udp.lst using the same build environment (buildroot)
that built the kernel which generated the stack trace, so I think the
addresses should match up:

ffff800010c1bb60 <udp_send_skb.isra.0>:
static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
...
                udp4_hwcsum(skb, fl4->saddr, fl4->daddr);
ffff800010c1bc78:       29450ae1        ldp     w1, w2, [x23, #40]
ffff800010c1bc7c:       aa1303e0        mov     x0, x19
ffff800010c1bc80:       94000000        bl      ffff800010c184b0 <udp4_hwcsum>
                        ffff800010c1bc80: R_AARCH64_CALL26     udp4_hwcsum
        err = ip_send_skb(sock_net(sk), skb);
ffff800010c1bc84:       f9401ac0        ldr     x0, [x22, #48]
ffff800010c1bc88:       aa1303e1        mov     x1, x19
ffff800010c1bc8c:       94000000        bl      0 <ip_send_skb>
                        ffff800010c1bc8c: R_AARCH64_CALL26     ip_send_skb
        if (err) {
ffff800010c1bc90:       350008e0        cbnz    w0, ffff800010c1bdac <udp_send_skb.isra.0+0x24c>
...
        u64 pc = READ_ONCE(ti->preempt_count);
ffff800010c1bcd4:       f9400820        ldr     x0, [x1, #16]
        WRITE_ONCE(ti->preempt.count, --pc);
ffff800010c1bcd8:       d1000400        sub     x0, x0, #0x1
ffff800010c1bcdc:       b9001020        str     w0, [x1, #16]
        return !pc || !READ_ONCE(ti->preempt_count);
...

The full udp.lst file:
https://drive.google.com/file/d/1j0RKOfuMXmCRWILpkG3uk_beohWrr-ho/view?usp=sharing
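
The offsets in the trace can be checked against the listing with a little
arithmetic. The following is only a sketch: the start address is copied from
the "<udp_send_skb.isra.0>:" line of the listing, the offsets from the pc and
lr lines of the trace, and the helper name listing_address is made up for
illustration.

```python
# Symbol start, copied from the udp.lst line
# "ffff800010c1bb60 <udp_send_skb.isra.0>:".
FUNC_START = 0xFFFF800010C1BB60

# Offsets reported in the trace (pc : ...+0x178, lr : ...+0x130).
TRACE_OFFSETS = {"pc": 0x178, "lr": 0x130}


def listing_address(func_start: int, offset: int) -> str:
    """Return symbol start + offset, formatted like an address in the listing."""
    return f"{func_start + offset:016x}"


for reg, off in TRACE_OFFSETS.items():
    addr = listing_address(FUNC_START, off)
    print(f"{reg}: udp_send_skb.isra.0+{off:#x} -> {addr}")
```

In the excerpt above, the pc address is the "sub x0, x0, #0x1" of the
preempt count decrement, and the lr address is the "cbnz w0" immediately
after "bl ip_send_skb", that is, the return address of the ip_send_skb()
call.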

Since I only have this one trace, I am not 100% sure whether this
location is a coincidence or whether the crash always happens here.

But things seem to add up to me: mdns-repeater deals with UDP packets,
and it seems that the code tries to make use of HW checksumming
(judging from lr)? This would explain why only this platform shows the
problem.

--
Stefan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel