Re: Regression with suspicious RCU usage splats with cpu_pm change
From: Tony Lindgren <tony@atomide.com>
Date: 2017-07-19 08:01:34
Also in: linux-omap, lkml

* Paul E. McKenney [off-list ref] [170718 10:00]:
> On Mon, Jul 17, 2017 at 10:41:38PM -0700, Tony Lindgren wrote:
> > * Paul E. McKenney [off-list ref] [170717 05:40]:
> > > On Sun, Jul 16, 2017 at 11:08:07PM -0700, Tony Lindgren wrote:
> > > > * Alex Shi [off-list ref] [170716 16:25]:
> > > > > I reused the rcu_irq_enter_irqson() from RCU_NONIDLE to avoid
> > > > > this issue. It works fine.
> > > > >
> > > > > Tony, Could you like to give a tested-by if this patch works
> > > > > for you.
> > > >
> > > > Yeah that keeps things booting for me with no splats so:
> > > >
> > > > Tested-by: Tony Lindgren <tony@atomide.com>
> > > >
> > > > In general, it seems we're missing the knowledge in Linux kernel
> > > > of when the entire system is idle. Right now it seems that only
> > > > cpuidle_coupled knows that?
> > > >
> > > > We could probably simplify things by adding some PM state for
> > > > entire system idle. Then cpuidle code and timer code could use
> > > > that to test when it's safe to do whatever the SoC needs to do
> > > > to enter deeper power states.
> > > >
> > > > If we already have something like that, please do let me know :)
> > >
> > > Well, we used to have CONFIG_NO_HZ_FULL_SYSIDLE, which detected
> > > full-system idle lazily so as to avoid scalability bottlenecks.
> > >
> > > https://lwn.net/Articles/558284/
> > >
> > > No one was using it, so I removed it last merge window. The
> > > patch that removed it is at sysidle.2017.05.11a, which can
> > > probably still be reverted cleanly. Or just use v4.11 or earlier.
> >
> > OK thanks for the pointer, for reference that commit is
> > fe5ac724d81a ("rcu: Remove nohz_full full-system-idle state
> > machine").
> >
> > For a potential user, I think we could use it for example in
> > cpuidle_enter_state_coupled() + omap_enter_idle_coupled() where
> > we try to figure out if the system is fully idle before calling
> > tick_broadcast_enter().
>
> Would you be willing to prototype your usage on v4.12? It still
> has NO_HZ_FULL_SYSIDLE. You have to enable NO_HZ_FULL in order to
> enable NO_HZ_FULL_SYSIDLE at the moment.
>
> Either way, here is the important bit for usage:
>
> 	bool rcu_sys_is_idle(void);
> 	void rcu_sysidle_force_exit(void);
>
> The rcu_sys_is_idle() function returns true if all CPUs other than
> the time-keeping CPU (that is, tick_do_timer_cpu, which is usually
> CPU 0) are in their idle loop. Of course, if you invoke
> rcu_sys_is_idle() from any CPU other than the time-keeping CPU,
> you will automatically get a return value of false.
>
> RCU's idle-exit code already sets state appropriately, but if
> there is some other circumstance where you need to force the state
> machine out of all-CPUs-idle state, you can call
> rcu_sysidle_force_exit().

OK sure I'll take a look at some point.

Thanks,

Tony
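
For readers following the thread, the rcu_sys_is_idle() contract Paul describes above can be sketched as a small userspace model. This is not the kernel implementation: every model_* name, the fixed CPU count, and the plain per-CPU flag array are assumptions made up for illustration. The real code detected idle lazily through a state machine, and neither that laziness nor rcu_sysidle_force_exit() is modeled here. The sketch only captures the two rules stated in the thread: callers other than the time-keeping CPU always get false, and the time-keeping CPU gets true only when every other CPU is in its idle loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model parameters, not kernel symbols. */
#define MODEL_NR_CPUS          4
#define MODEL_TIMEKEEPING_CPU  0   /* stands in for tick_do_timer_cpu */

/* Hypothetical per-CPU flag: true while that CPU is in its idle loop. */
static bool model_cpu_idle[MODEL_NR_CPUS];

/* Record that a CPU entered (true) or left (false) its idle loop. */
static void model_set_idle(int cpu, bool idle)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_cpu_idle[cpu] = idle;
}

/*
 * Model of the contract described for rcu_sys_is_idle():
 *  - any caller other than the time-keeping CPU gets false;
 *  - the time-keeping CPU gets true only if all other CPUs are idle.
 * The time-keeping CPU's own idle flag is deliberately ignored.
 */
static bool model_sys_is_idle(int calling_cpu)
{
	if (calling_cpu != MODEL_TIMEKEEPING_CPU)
		return false;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == MODEL_TIMEKEEPING_CPU)
			continue;
		if (!model_cpu_idle[cpu])
			return false;
	}
	return true;
}
```

In a coupled-idle path like the one Tony mentions, a check of this shape would sit in front of the call that enters the deeper power state, and only the time-keeping CPU could ever see it succeed.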