Thread: 17 messages, 6 authors, 2021-07-26

Re: Random reboots on ODROID-N2+

From: Neil Armstrong <hidden>
Date: 2021-05-18 09:37:50
Also in: linux-amlogic

Hi Stefan,

On 18/05/2021 11:16, Stefan Agner wrote:
Hi Martin,

On 2021-05-17 23:09, Martin Blumenstingl wrote:
quoted
Hi Stefan,

On Mon, May 17, 2021 at 11:14 AM Stefan Agner [off-list ref] wrote:
quoted
Hi,

We are currently testing a new release using Linux 5.10.33. Since then
I've received several reports of random reboots every couple of days.
Unfortunately the log (journald) doesn't show anything, just a hard cut
at some point.
I'm sorry to hear that some things are not working right

[...]
quoted
[202983.988187] Hardware name: Hardkernel ODROID-N2Plus (DT)
[202983.988188] Call trace:
[202983.988188]  dump_backtrace+0x0/0x1a0
[202983.988189]  show_stack+0x18/0x70
[202983.988190]  dump_stack+0xd0/0x12c
[202983.988190]  panic+0x170/0x338
[202983.988191]  nmi_panic+0x8c/0x90
[202983.988191]  arm64_serror_panic+0x78/0x84
[202983.988192]  do_serror+0x38/0xa0
[202983.988193]  el1_error+0x88/0x108
[202983.988193]  udp_send_skb.isra.0+0x178/0x390
[202983.988194]  udp_sendmsg+0x7c8/0x9c0
[202983.988194]  inet_sendmsg+0x44/0x70
[202983.988195]  sock_sendmsg+0x4c/0x60
[202983.988196]  __sys_sendto+0xd0/0x140
[202983.988196]  __arm64_sys_sendto+0x28/0x40
[202983.988197]  el0_svc_common.constprop.0+0x78/0x1a0
[202983.988197]  do_el0_svc+0x24/0x90
[202983.988198]  el0_svc+0x14/0x20
[202983.988199]  el0_sync_handler+0xb0/0xc0
[202983.988199]  el0_sync+0x178/0x180
[202983.988211] SMP: stopping secondary CPUs
[202983.988212] Kernel Offset: disabled
[202983.988212] CPU features: 0x0240002,61082004
[202983.988213] Memory Limit: none
that looks weird
quoted
Has anyone observed such an issue? I am pretty sure that this is a new
issue, as we have many installations using Linux 5.9.16 running stably
on the same hardware.
I haven't, but I am currently trying to hunt down a (probably
unrelated) Ethernet issue on an older Meson8m2 SoC.
All Amlogic Meson SoCs use a DWMAC IP for Ethernet connectivity, plus
there's a little bit of "glue" IP for the xMII connecting to the SoC's
IO pads.

I think it's a good idea to involve the netdev and (probably even more
important) stmmac maintainers.
Anything skb related is handled by the stmmac driver.
So I am hoping that someone with expertise in that area can give any
hints for debugging or reproducing this.
Ok, I'll do that. I'm currently waiting to see the same trace a second
time, just to make sure it's really always caused by that code path.
A good next step would be to eventually bisect between the last known
working version and the version you are currently running.

An SError interrupt looks like a HW issue caused by a change in v5.10.
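
Since an SError is reported asynchronously on arm64, the frame listed
right after el1_error in the trace shows what the CPU happened to be
executing when the error was taken, which is not necessarily the code
that caused it. That makes the check mentioned above (waiting to see
whether a second crash shows the same trace) worthwhile. A minimal
sketch of that comparison in Python follows, assuming the panic output
has been captured as text (for example from a serial console). The
helper names interrupted_function and tally_interrupted are
hypothetical and exist only for this illustration; the sample frames
are copied from the trace quoted earlier in this message.

```python
import re
from collections import Counter

# Matches one call-trace frame as printed in the kernel log, e.g.
# "[202983.988193]  udp_send_skb.isra.0+0x178/0x390"
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)

# Frames copied from the trace quoted in this thread.
SAMPLE = """\
[202983.988191]  arm64_serror_panic+0x78/0x84
[202983.988192]  do_serror+0x38/0xa0
[202983.988193]  el1_error+0x88/0x108
[202983.988193]  udp_send_skb.isra.0+0x178/0x390
[202983.988194]  udp_sendmsg+0x7c8/0x9c0
"""


def interrupted_function(log_text, entry="el1_error"):
    """Return the symbol of the frame listed right after `entry`.

    Hypothetical helper: `entry` is the exception entry point seen in
    the trace; the next frame is the code that was interrupted.
    Returns None if no such frame is found.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    for index, symbol in enumerate(frames[:-1]):
        if symbol == entry:
            return frames[index + 1]
    return None


def tally_interrupted(logs):
    """Count how often each interrupted function appears across logs."""
    return Counter(interrupted_function(text) for text in logs)


if __name__ == "__main__":
    print(interrupted_function(SAMPLE))
```

If several captured crashes all report the same interrupted function,
that points towards the network transmit path; if the function varies
from crash to crash, the SError is more likely unrelated to whatever
code was running at that moment.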

Neil
--
Stefan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel