Thread (83 messages) 83 messages, 4 authors, 2021-10-19

Re: [PATCH v13 13/35] drm/tegra: gr2d: Support generic power domain and runtime PM

From: Ulf Hansson <hidden>
Date: 2021-10-05 08:46:43
Also in: dri-devel, linux-clk, linux-mmc, linux-pm, linux-pwm, linux-staging, linux-tegra, linux-usb, lkml

On Mon, 4 Oct 2021 at 17:57, Dmitry Osipenko [off-list ref] wrote:
04.10.2021 14:01, Ulf Hansson пишет:
quoted
On Fri, 1 Oct 2021 at 21:00, Dmitry Osipenko [off-list ref] wrote:
quoted
01.10.2021 17:55, Ulf Hansson пишет:
quoted
On Fri, 1 Oct 2021 at 16:29, Dmitry Osipenko [off-list ref] wrote:
quoted
01.10.2021 16:39, Ulf Hansson пишет:
quoted
On Mon, 27 Sept 2021 at 00:42, Dmitry Osipenko [off-list ref] wrote:
quoted
Add runtime power management and support generic power domains.

Tested-by: Peter Geis <redacted> # Ouya T30
Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20
Tested-by: Nicolas Chauvet <redacted> # PAZ00 T20 and TK1 T124
Tested-by: Matt Merhar <redacted> # Ouya T30
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/gpu/drm/tegra/gr2d.c | 155 +++++++++++++++++++++++++++++++++--
[...]
quoted
 static int gr2d_remove(struct platform_device *pdev)
@@ -259,15 +312,101 @@ static int gr2d_remove(struct platform_device *pdev)
                return err;
        }

+       pm_runtime_dont_use_autosuspend(&pdev->dev);
+       pm_runtime_disable(&pdev->dev);
There is no guarantee that the ->runtime_suspend() has been invoked
here, which means that clock may be left prepared/enabled beyond this
point.

I suggest you call pm_runtime_force_suspend(), instead of
pm_runtime_disable(), to make sure that gets done.
The pm_runtime_disable() performs the final synchronization, please see [1].

[1]
https://elixir.bootlin.com/linux/v5.15-rc3/source/drivers/base/power/runtime.c#L1412
pm_runtime_disable() end up calling _pm_runtime_barrier(), which calls
cancel_work_sync() if dev->power.request_pending has been set.

If the work that was punted to the pm_wq in rpm_idle() has not been
started yet, we end up just canceling it. In other words, there are no
guarantees it runs to completion.
You're right. Although, in a case of this particular patch, the syncing
is actually implicitly done by pm_runtime_dont_use_autosuspend().

But for drivers which don't use auto-suspend, there is no sync. This
looks like a disaster, it's a very common pattern for drivers to
'put+disable'.
quoted
Moreover, use space may have bumped the usage count via sysfs for the
device (pm_runtime_forbid()) to keep the device runtime resumed.
Right, this is also a disaster in a case of driver removal.
quoted
quoted
Calling pm_runtime_force_suspend() isn't correct because each 'enable'
must have the corresponding 'disable'. Hence there is no problem here.
pm_runtime_force_suspend() calls pm_runtime_disable(), so I think that
should be fine. No?
[adding Rafael]

Rafael, could you please explain how drivers are supposed to properly
suspend and disable RPM to cut off power and reset state that was
altered by the driver's resume callback? What we're missing? Is Ulf's
suggestion acceptable?

The RPM state of a device is getting reset on driver's removal, hence
all refcounts that were bumped by the rpm-resume callback of the device
driver will be screwed up if device is kept resumed after removal. I
just verified that it's true in practice.
Note that, what makes the Tegra drivers a bit special is that they are
always built with CONFIG_PM being set (selected from the "SoC"
Kconfig).

Therefore, pm_runtime_force_suspend() can work for some of these
cases. Using this, would potentially avoid the driver from having to
runtime resume the device in ->remove(), according to the below
generic sequence, which is used in many drivers.

pm_runtime_get_sync()
clk_disable_unprepare() (+ additional things to turn off the device)
pm_runtime_disable()
pm_runtime_put_noidle()
It's not a problem to change this patchset. The problem is that if
you'll grep mainline for 'pm_runtime_disable', you will find that there
are a lot of drivers in a potential trouble.
Let's start by fixing this patchset, please - then we can consider
what to do with the other cases separately.
quoted hunk ↗ jump to hunk
I'm proposing that we should change pm_runtime_disable() to perform the
syncing with this oneliner:
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index ec94049442b9..5c9f28165824 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1380,6 +1380,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
  */
 void __pm_runtime_disable(struct device *dev, bool check_resume)
 {
+       flush_work(&dev->power.work);
+
What about the latency this may introduce? I am not sure that is
acceptable here!?
        spin_lock_irq(&dev->power.lock);

        if (dev->power.disable_depth > 0) {

Objections?

The sysfs rpm-forbid is a separate problem and it's less troublesome
since it requires root privileges. It's also not something that
userspace touches casually. For now I don't know what could be done
about it.
As I said, the common method to address this problem is to run the
following sequence:

pm_runtime_get_sync()
"power off the device"
pm_runtime_disable()
pm_runtime_put_noidle()

This works even if user space, via sysfs, has triggered a call to
pm_runtime_forbid(). Or doesn't it?

If you don't like it, pm_runtime_force_suspend() should work too, at
least for your cases, I believe.

Kind regards
Uffe
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help