Thread: 5 messages, 3 authors, 2021-12-03

Re: cpuidle on i.MX8MQ

From: Abel Vesa <hidden>
Date: 2021-12-03 08:40:13

On 21-11-29 14:40:04, Martin Kepplinger wrote:
On Thursday, 04.11.2021 at 13:04 +0200, Abel Vesa wrote:
On 21-11-03 13:09:15, Martin Kepplinger wrote:
On Tuesday, 02.11.2021 at 11:55 +0100, Alexander Stein wrote:
Hello,

I was hit by erratum e11171 on imx8mq on our custom board. I found
[1] from over 2 years ago, and the even older patchset [2].
Is there a final conclusion or fix for this erratum? From what
I understand, the proposed change is apparently not acceptable in
mainline for several reasons. I'm wondering what the current
status is.
Unfortunately, there is not going to be an upstream solution for this
erratum. Long story short, the SoC is missing wakeup lines from the GIC
to the GPC. This means the IPIs are affected. So, knowing all that,
in order to wake up a core, you need to write a bit in a GPC
register. The SW workaround (not upstreamable) I provided does exactly
that: it hijacks the gic_raise_softirq __smp_cross_call handler and
registers a wrapper over it, which also calls into ATF (using SIP)
and wakes up that specific core by writing into the GPC register.

There is no other possible way to wake up a core on 8MQ.
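
The wrapper idea described above can be pictured with a small user-space
toy model. This is only a sketch, not the actual patch and not kernel
code: the class ToySoC and the names raise_ipi, sip_wake_core and
install_wakeup_wrapper are hypothetical stand-ins for the IPI-raise hook
and the SiP call into ATF, and calling the original hook before the
wake-up is an assumption made for illustration.

```python
"""Toy user-space model of the wake-up wrapper idea; not kernel code."""
from typing import Callable, Iterable, List


class ToySoC:
    """Models cores that an IPI alone cannot wake from cpu-sleep."""

    def __init__(self, num_cores: int = 4) -> None:
        self.idle: List[bool] = [False] * num_cores
        self.pending: List[int] = [0] * num_cores
        self.handled: List[int] = [0] * num_cores
        # Stand-in for the kernel's IPI-raise hook (hypothetical name).
        self.raise_ipi: Callable[[Iterable[int]], None] = self._gic_raise

    def _deliver(self, cpu: int) -> None:
        # A core only services pending IPIs while it is awake.
        if not self.idle[cpu]:
            self.handled[cpu] += self.pending[cpu]
            self.pending[cpu] = 0

    def _gic_raise(self, cpus: Iterable[int]) -> None:
        # The GIC marks the IPI pending, but with the missing wakeup
        # lines a sleeping core never notices it.
        for cpu in cpus:
            self.pending[cpu] += 1
            self._deliver(cpu)

    def sip_wake_core(self, cpu: int) -> None:
        # Stand-in for the SiP call into ATF that sets the core's
        # wake-up bit in the GPC (hypothetical name).
        self.idle[cpu] = False
        self._deliver(cpu)


def install_wakeup_wrapper(soc: ToySoC) -> None:
    """Replace the IPI hook with a wrapper that also wakes each target."""
    original = soc.raise_ipi

    def wrapped(cpus: Iterable[int]) -> None:
        targets = list(cpus)
        original(targets)            # raise the IPI as before
        for cpu in targets:
            soc.sip_wake_core(cpu)   # then explicitly wake the core

    soc.raise_ipi = wrapped
```

In this model an IPI sent to a sleeping core stays pending until
install_wakeup_wrapper() has been applied, which mirrors why the wrapper
has to sit on the IPI path itself.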
As suggested at that time, the only solution (right now) is to disable
cpuidle on imx8mq?
Yes, the vendor actually suggests that, but you can use the mentioned
hack.
Best regards,
Alexander

[1] https://lkml.org/lkml/2019/6/10/350
[2] https://lkml.org/lkml/2019/3/27/542
Hi Alexander, hi Abel,

At this point my understanding is basically the same. We have carried (a
slight variation of) the above in our tree ever since, in order to have
the cpu-sleep state. Not using it is not acceptable to us :)

Until now there's one internal API change we need to revert (bring
back) in order for this to work. For reference, this is our current
implementation:

https://source.puri.sm/martin.kepplinger/linux-next/-/compare/0b90c3622755e0155632d8cc25edd4eb7f875968...ce4803745a180adc8d87891d4ff8dff1c7bd5464

Abel, if this solution stops applying in the future, would you still
be available to create an update?
I'll try to find a workaround soon, based on the same general idea
behind the current one you guys are using. I'll do this in my own time
since the company does not allocate resources for 8MQ cpuidle support
anymore.
Can you even imagine a possibly acceptable solution for mainline to
this? Nothing is completely set in stone with Linux :)
I believe Marc was pretty clear about not accepting such a workaround
(and, TBH, it makes perfect sense not to).

Since I don't think there is any other way that would go around the
gic driver, I believe this has hit an end when it comes to upstream
support.

Sorry about that.

I'm open to any suggestions though.
Hi Abel, since there's the link to the workaround implementation here,
I'd like to show you a bug when transitioning to s2idle. I don't see
it when removing all these cpu-sleep additions (linked above). (I
might send it as a separate bug report later.)

Can you see how that can cause this RCU stall? It looks like a problem
with a timer...
Looks to me like some core is not waking up.

You can start by hacking the irqchip driver to see if the ATF call
is still being made after s2idle is triggered. If yes, then make
sure the ATF is writing the GPC reg to wake up that specific core.

That's usually what's going wrong with this workaround.
[ 65.476456] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 65.476615] rcu: 0-...!: (1 ticks this GP) idle=42f/1/0x4000000000000004 softirq=9151/9151 fqs=0
[ 65.476676] (t=8974 jiffies g=11565 q=2)
[ 65.476703] rcu: rcu_preempt kthread timer wakeup didn't happen for 8973 jiffies! g11565 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 65.476715] rcu: Possible timer handling issue on cpu=0 timer-softirq=2032
[ 65.476730] rcu: rcu_preempt kthread starved for 8974 jiffies! g11565 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 65.476742] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 65.476749] rcu: RCU grace-period kthread stack dump:
[ 65.476764] task:rcu_preempt state:I stack: 0 pid: 13 ppid: 2 flags:0x00000008
[ 65.476814] Call trace:
[ 65.476825] __switch_to+0x138/0x190
[ 65.476975] __schedule+0x288/0x6ec
[ 65.477044] schedule+0x7c/0x110
[ 65.477059] schedule_timeout+0xa4/0x1c4
[ 65.477085] rcu_gp_fqs_loop+0x13c/0x51c
[ 65.477126] rcu_gp_kthread+0x1a4/0x264
[ 65.477136] kthread+0x15c/0x170
[ 65.477167] ret_from_fork+0x10/0x20
[ 65.477186] rcu: Stack dump where RCU GP kthread last ran:
[ 65.477194] Task dump for CPU 0:
[ 65.477202] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x0000000a
[ 65.477223] Call trace:
[ 65.477226] dump_backtrace+0x0/0x1e4
[ 65.477246] show_stack+0x24/0x30
[ 65.477256] sched_show_task+0x15c/0x180
[ 65.477293] dump_cpu_task+0x50/0x60
[ 65.477327] rcu_check_gp_kthread_starvation+0x128/0x148
[ 65.477335] rcu_sched_clock_irq+0xb74/0xf04
[ 65.477348] update_process_times+0xa8/0xf4
[ 65.477388] tick_sched_handle+0x3c/0x60
[ 65.477409] tick_sched_timer+0x58/0xb0
[ 65.477416] __hrtimer_run_queues+0x18c/0x370
[ 65.477428] hrtimer_interrupt+0xf4/0x250
[ 65.477437] arch_timer_handler_phys+0x40/0x50
[ 65.477477] handle_percpu_devid_irq+0x94/0x250
[ 65.477505] handle_domain_irq+0x6c/0xa0
[ 65.477516] gic_handle_irq+0xc4/0x144
[ 65.477527] call_on_irq_stack+0x2c/0x54
[ 65.477534] do_interrupt_handler+0x5c/0x70
[ 65.477544] el1_interrupt+0x30/0x80
[ 65.477556] el1h_64_irq_handler+0x18/0x24
[ 65.477567] el1h_64_irq+0x78/0x7c
[ 65.477575] cpuidle_enter_s2idle+0x14c/0x1ac
[ 65.477617] do_idle+0x25c/0x2a0
[ 65.477644] cpu_startup_entry+0x30/0x80
[ 65.477656] rest_init+0xec/0x100
[ 65.477666] arch_call_rest_init+0x1c/0x28
[ 65.477700] start_kernel+0x6e0/0x720
[ 65.477709] __primary_switched+0xc0/0xc8
[ 65.477751] Task dump for CPU 0:
[ 65.477757] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x0000000a
[ 65.477770] Call trace:
[ 65.477773] dump_backtrace+0x0/0x1e4
[ 65.477788] show_stack+0x24/0x30
[ 65.477796] sched_show_task+0x15c/0x180
[ 65.477804] dump_cpu_task+0x50/0x60
[ 65.477812] rcu_dump_cpu_stacks+0xf4/0x138
[ 65.477820] rcu_sched_clock_irq+0xb78/0xf04
[ 65.477829] update_process_times+0xa8/0xf4
[ 65.477838] tick_sched_handle+0x3c/0x60
[ 65.477845] tick_sched_timer+0x58/0xb0
[ 65.477854] __hrtimer_run_queues+0x18c/0x370
[ 65.477863] hrtimer_interrupt+0xf4/0x250
[ 65.477873] arch_timer_handler_phys+0x40/0x50
[ 65.477880] handle_percpu_devid_irq+0x94/0x250
[ 65.477888] handle_domain_irq+0x6c/0xa0
[ 65.477897] gic_handle_irq+0xc4/0x144
[ 65.477903] call_on_irq_stack+0x2c/0x54
[ 65.477910] do_interrupt_handler+0x5c/0x70
[ 65.477921] el1_interrupt+0x30/0x80
[ 65.477929] el1h_64_irq_handler+0x18/0x24
[ 65.477937] el1h_64_irq+0x78/0x7c
[ 65.477944] cpuidle_enter_s2idle+0x14c/0x1ac
[ 65.477952] do_idle+0x25c/0x2a0
[ 65.477959] cpu_startup_entry+0x30/0x80
[ 65.477970] rest_init+0xec/0x100
[ 65.477977] arch_call_rest_init+0x1c/0x28
[ 65.477988] start_kernel+0x6e0/0x720
[ 65.477995] __primary_switched+0xc0/0xc8
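
When triaging a report like the one above, the fields that usually matter
first are which grace-period kthread starved, for how long, and on which
CPU. The following is a minimal sketch for pulling those out of a pasted
log; the function name parse_rcu_starvation and the returned dictionary
keys are made up for illustration, and the pattern only targets the
"kthread starved for" line as it appears in this log.

```python
import re
from typing import Dict, Optional, Union

# Matches e.g. "rcu: rcu_preempt kthread starved for 8974 jiffies! ... ->cpu=0".
# DOTALL lets the match span a line that the mail client has wrapped.
STARVED_RE = re.compile(
    r"rcu: (\w+) kthread starved for (\d+) jiffies!.*?->cpu=(\d+)",
    re.DOTALL,
)


def parse_rcu_starvation(log: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return kthread name, starved jiffies and CPU, or None if absent."""
    match = STARVED_RE.search(log)
    if match is None:
        return None
    return {
        "kthread": match.group(1),
        "jiffies": int(match.group(2)),
        "cpu": int(match.group(3)),
    }
```

Feeding it the text of a saved dmesg capture gives a quick way to check
whether the starved CPU matches the core that was expected to be woken.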