Thread (39 messages) 39 messages, 8 authors, 2025-02-05

RE: [PATCH 2/6] thermal: of: Export non-devres helper to register/unregister thermal zone

From: Biju Das <biju.das.jz@bp.renesas.com>
Date: 2025-01-30 10:33:28
Also in: linux-clk, linux-devicetree, linux-pm, linux-renesas-soc, lkml

Hi Daniel Lezcano, 
-----Original Message-----
From: Daniel Lezcano <redacted>
Sent: 30 January 2025 10:07
Subject: Re: [PATCH 2/6] thermal: of: Export non-devres helper to register/unregister thermal zone

On Thu, Jan 30, 2025 at 11:08:03AM +0200, Claudiu Beznea wrote:
quoted
Hi, Daniel,

On 29.01.2025 19:24, Daniel Lezcano wrote:
quoted
Hi Claudiu,

On Fri, Jan 03, 2025 at 06:38:01PM +0200, Claudiu wrote:
quoted
From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>

On the Renesas RZ/G3S (and other Renesas SoCs, e.g., RZ/G2{L, LC,
UL}), clocks are managed through PM domains. These PM domains,
registered on behalf of the clock controller driver, are configured
with GENPD_FLAG_PM_CLK. In most of the Renesas drivers used by RZ
SoCs, the clocks are enabled/disabled using runtime PM APIs.

During probe, devices are attached to the PM domain controlling
their clocks. Similarly, during removal, devices are detached from the PM domain.

The detachment call stack is as follows:

device_driver_detach() ->
  device_release_driver_internal() ->
    __device_release_driver() ->
      device_remove() ->
        platform_remove() ->
	  dev_pm_domain_detach()

In the upcoming Renesas RZ/G3S thermal driver, the struct
thermal_zone_device_ops::change_mode API is implemented to
start/stop the thermal sensor unit. Register settings are updated
within the change_mode API.

In case devres helpers are used for thermal zone
register/unregister the struct thermal_zone_device_ops::change_mode
API is invoked when the driver is unbound. The identified call stack is as follows:

device_driver_detach() ->
  device_release_driver_internal() ->
    device_unbind_cleanup() ->
      devres_release_all() ->
        devm_thermal_of_zone_release() ->
	  thermal_zone_device_disable() ->
	    thermal_zone_device_set_mode() ->
	      rzg3s_thermal_change_mode()

The device_unbind_cleanup() function is called after the thermal
device is detached from the PM domain (via dev_pm_domain_detach()).

The rzg3s_thermal_change_mode() implementation calls
pm_runtime_resume_and_get()/pm_runtime_put_autosuspend()
before/after accessing the registers. However, during the unbind
scenario, the
devm_thermal_of_zone_release() is invoked after dev_pm_domain_detach().
Consequently, the clocks are not enabled, as the device is removed
from the PM domain at this time, leading to an Asynchronous SError Interrupt.
The system cannot be used after this.
I've been through the driver before responding to this change. What
is the benefit of powering down / up (or clock off / on) the thermal
sensor when reading the temperature ?

I can understand for disable / enable but I don't get for the
classic usage where a governor will be reading the temperature regularly.
I tried to be as power saving as possible both at runtime and after
the IP is not used anymore as the HW manual doesn't mentioned anything
about accuracy or implications of disabling the IP clock at runtime.
We use similar approach (of disabling clocks at runtime) for other IPs
in the RZ/G3S SoC as well.
quoted
Would the IP need some cycles to capture the temperature accurately
after the clock is enabled ?
There is nothing about this mentioned about this in the HW manual of
the RZ/G3S SoC. The only points mentioned are as described in the driver code:
- wait at least 3us after each IIO channel read
- wait at least 30us after enabling the sensor
- wait at least 50us after setting OE bit in TSU_SM

For this I chose to have it implemented as proposed.
IMO, disabling/enabling the clock between two reads through the pm runtime may not be a good thing,
especially if the system enters a thermal situation where it has to mitigate.
Just a question, You mean to avoid device destruction due to high temperature?? Assuming disabling the clk happens
when the temp reaches the boundary and re-enabling of the clk after a time(which involves monitoring the CLK ON
bit after enabling it, or a run time enable failure happens), where it exceeds the threshold??

Cheers,
Biju
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help