Re: [PATCH 2/2] PM: sleep: Fix runtime PM based cpuidle support
From: Ulf Hansson <hidden>
Date: 2021-10-21 18:12:36
Also in:
linux-pm, lkml
On Thu, 21 Oct 2021 at 18:33, Rafael J. Wysocki [off-list ref] wrote:
On Thu, Oct 21, 2021 at 6:17 PM Ulf Hansson [off-list ref] wrote:quoted
On Thu, 21 Oct 2021 at 17:09, Rafael J. Wysocki [off-list ref] wrote:quoted
On Thu, Oct 21, 2021 at 4:05 PM Ulf Hansson [off-list ref] wrote:quoted
On Thu, 21 Oct 2021 at 15:45, Rafael J. Wysocki [off-list ref] wrote:quoted
On Thu, Oct 21, 2021 at 1:49 PM Ulf Hansson [off-list ref] wrote:quoted
On Wed, 20 Oct 2021 at 20:18, Rafael J. Wysocki [off-list ref] wrote:quoted
On Wed, Sep 29, 2021 at 4:44 PM Ulf Hansson [off-list ref] wrote:quoted
In the cpuidle-psci case, runtime PM in combination with the generic PM domain (genpd), may be used when entering/exiting an idlestate. More precisely, genpd relies on runtime PM to be enabled for the attached device (in this case it belongs to a CPU), to properly manage the reference counting of its PM domain. This works fine most of the time, but during system suspend in the dpm_suspend_late() phase, the PM core disables runtime PM for all devices. Beyond this point and until runtime PM becomes re-enabled in the dpm_resume_early() phase, calls to pm_runtime_get|put*() will fail. To make sure the reference counting in genpd becomes correct, we need to prevent cpuidle-psci from using runtime PM when it has been disabled for the device. Therefore, let's move the call to cpuidle_pause() from dpm_suspend_noirq() to dpm_suspend_late() - and cpuidle_resume() from dpm_resume_noirq() into dpm_resume_early(). Diagnosed-by: Maulik Shah [off-list ref] Suggested-by: Maulik Shah <redacted> Signed-off-by: Ulf Hansson <redacted> --- drivers/base/power/main.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index cbea78e79f3d..1c753b651272 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c@@ -747,8 +747,6 @@ void dpm_resume_noirq(pm_message_t state) resume_device_irqs(); device_wakeup_disarm_wake_irqs(); - - cpuidle_resume(); } /**@@ -870,6 +868,7 @@ void dpm_resume_early(pm_message_t state) } mutex_unlock(&dpm_list_mtx); async_synchronize_full(); + cpuidle_resume(); dpm_show_time(starttime, state, 0, "early"); trace_suspend_resume(TPS("dpm_resume_early"), state.event, false); }@@ -1336,8 +1335,6 @@ int dpm_suspend_noirq(pm_message_t state) { int ret; - cpuidle_pause(); - device_wakeup_arm_wake_irqs(); suspend_device_irqs();@@ -1467,6 +1464,7 @@ int dpm_suspend_late(pm_message_t state) int error = 0; trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true); + cpuidle_pause(); mutex_lock(&dpm_list_mtx); pm_transition = state; async_error = 0; --Well, this is somewhat heavy-handed and it affects even the systems that don't really need to pause cpuidle at all in the suspend path.Yes, I agree. Although, I am not really changing the behaviour in regards to this. cpuidle_pause() is already being called in dpm_suspend_noirq(), for everybody today.Yes, it is, but pausing it earlier will cause more energy to be spent, potentially. That said, there are not too many users of suspend_late callbacks in the tree, so it may not matter too much.quoted
quoted
Also, IIUC you don't need to pause cpuidle completely, but make it temporarily avoid idle states potentially affected by this issue. An additional CPUIDLE_STATE_DISABLED_ flag could be used for that I suppose and it could be set via cpuidle_suspend() called from the core next to cpufreq_suspend().cpuidle_suspend() would then need to go and fetch the cpuidle driver instance, which in some cases is one driver per CPU. Doesn't that get rather messy?Per-CPU variables are used for that, so it is quite straightforward.quoted
Additionally, since find_deepest_state() is being called for cpuidle_enter_s2idle() too, we would need to treat the new CPUIDLE_STATE_DISABLED_ flag in a special way, right?No, it already checks "disabled".Yes, but that would be wrong.Hmmm.quoted
The use case I want to support, for cpuidle-psci, is to allow all idle states in suspend-to-idle,So does PM-runtime work in suspend-to-idle? How?No it doesn't. See below.quoted
quoted
but prevent those that rely on runtime PM (after it has been disabled) for the regular idle path.Do you have a special suspend-to-idle handling of those states that doesn't require PM-runtime?Yes. Feel free to have a look in __psci_enter_domain_idle_state().So in theory you could check the pm_runtime_put_sync_suspend() return value and fall back to something like WFI if that's an error code.
I have already tried that, but it simply got too complicated. The main issue was that runtime PM could become disabled for the device in the middle of executing the ->enter() callback. For example, if pm_runtime_get_sync() fails, I still need to make sure the reference counting in genpd becomes correct - and I can't do that using dev_pm_genpd_resume(). That's because it's not designed to be called in this "unknown" suspend phase, but should be called after the noirq phase and be properly balanced with dev_pm_genpd_suspend(). In other words, the error path didn't work out for me. [...] Kind regards Uffe _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel