Thread: 69 messages, 11 authors, 2022-08-15

Re: [PATCH v2 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()

From: Alexander Stein <hidden>
Date: 2022-07-06 13:02:15
Also in: linux-gpio, linux-iommu, linux-pm, lkml

On Tuesday, 5 July 2022, 03:24:33 CEST, Saravana Kannan wrote:
On Mon, Jul 4, 2022 at 12:07 AM Alexander Stein [off-list ref] wrote:
On Friday, 1 July 2022, 09:02:22 CEST, Saravana Kannan wrote:
On Thu, Jun 30, 2022 at 11:02 PM Alexander Stein [off-list ref] wrote:
Hi Saravana,

On Friday, 1 July 2022, 02:37:14 CEST, Saravana Kannan wrote:
On Thu, Jun 23, 2022 at 5:08 AM Alexander Stein [off-list ref] wrote:
Hi,

On Tuesday, 21 June 2022, 09:28:43 CEST, Tony Lindgren wrote:
Hi,

* Saravana Kannan [off-list ref] [700101 02:00]:
Now that fw_devlink=on by default and fw_devlink supports
"power-domains" property, the execution will never get to the point
where driver_deferred_probe_check_state() is called before the supplier
has probed successfully or before deferred probe timeout has expired.

So, delete the call and replace it with -ENODEV.
Looks like this causes omaps to not boot in Linux next. With this,
simple-pm-bus fails to probe initially as the power-domain is not
yet available. On platform_probe(), genpd_get_from_provider() returns
-ENOENT.

Seems like other stuff is potentially broken too, any ideas on
how to fix this?
I think I'm hit by this as well, although I do not get a lockup.
In my case I'm using
arch/arm64/boot/dts/freescale/imx8mq-tqma8mq-mba8mx.dts and probing of
38320000.blk-ctrl fails as the power-domain is not (yet) registered.
Ok, took a look.

The problem is that there are two drivers for the same device and they
both initialize this device.

    gpc: gpc@303a0000 {
        compatible = "fsl,imx8mq-gpc";
    }

$ git grep -l "fsl,imx7d-gpc" -- drivers/
drivers/irqchip/irq-imx-gpcv2.c
drivers/soc/imx/gpcv2.c

IMHO, this is a bad/broken design.

So what's happening is that fw_devlink will block the probe of
38320000.blk-ctrl until 303a0000.gpc is initialized. And it stops
blocking the probe of 38320000.blk-ctrl as soon as the first driver
initializes the device. In this case, it's the irqchip driver.
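
The ordering described above can be sketched in plain userspace C. This is a toy model, not kernel code: `toy_fwnode`, `TOY_BLOCKED` and all `toy_*` functions are hypothetical names, and the single `initialized` flag is only an assumed stand-in for the state fw_devlink checks before letting a consumer probe.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a firmware node; not the kernel's struct fwnode_handle. */
struct toy_fwnode {
    bool initialized;      /* set once *any* driver has claimed the node */
    bool genpd_registered; /* set once the power-domain provider is ready */
};

#define TOY_BLOCKED 1 /* hypothetical "probe not attempted yet" result */

/* Early irqchip-style init: marks the node initialized, registers no domain. */
void toy_irqchip_init(struct toy_fwnode *gpc)
{
    assert(gpc != NULL);
    gpc->initialized = true;
}

/* Later platform-driver probe: only here does the power domain appear. */
void toy_gpc_platform_probe(struct toy_fwnode *gpc)
{
    assert(gpc != NULL);
    gpc->initialized = true;
    gpc->genpd_registered = true;
}

/* Consumer probe, gated only on the supplier's "initialized" flag. */
int toy_consumer_probe(const struct toy_fwnode *supplier)
{
    assert(supplier != NULL);
    if (!supplier->initialized)
        return TOY_BLOCKED; /* gate still holding the consumer back */
    if (!supplier->genpd_registered)
        return -ENODEV;     /* gate opened before the domain exists */
    return 0;
}
```

In this model, calling `toy_consumer_probe()` after `toy_irqchip_init()` but before `toy_gpc_platform_probe()` yields `-ENODEV`, which is the shape of the failure reported for 38320000.blk-ctrl.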

I'd recommend combining these drivers into one. Something like the
patch I'm attaching (sorry for the attachment, copy-paste is mangling
the tabs). Can you give it a shot please?
I tried this patch and it delayed the driver initialization (those of
UART as well, BTW). Unfortunately the driver fails the same way:
Thanks for testing the patch!
[    1.125253] imx8m-blk-ctrl 38320000.blk-ctrl: error -ENODEV: failed to attach power domain "bus"

More than that it even introduced some more errors:
[    0.008160] irq: no irq domain found for gpc@303a0000 !
So the idea behind my change was that as long as the irqchip isn't the
root of the irqdomain (might be using the terms incorrectly) like the
gic, you can make it a platform driver. And I was trying to hack up a
patch that's the equivalent of platform_irqchip_probe() (which just
ends up eventually calling the callback you use in IRQCHIP_DECLARE()).
I probably made some mistake in the quick hack that I'm sure is
fixable.
[    0.013251] Failed to map interrupt for /soc@0/bus@30400000/timer@306a0000
However, this timer driver also uses TIMER_OF_DECLARE() which can't
handle failure to get the IRQ (because it can't return -EPROBE_DEFER).
So, this means the timer driver in turn needs to be converted to a
platform driver if it's supposed to work with the IRQCHIP_DECLARE()
being converted to a platform driver.

But that's a can of worms not worth opening. But then I remembered
this simpler workaround will work and it is pretty much a variant of
the workaround that's already in the gpc's irqchip driver to allow two
drivers to probe the same device (people really should stop doing
that).

Can you drop my previous hack patch and try this instead please? I'm
99% sure this will work.
diff --git a/drivers/irqchip/irq-imx-gpcv2.c b/drivers/irqchip/irq-imx-gpcv2.c
index b9c22f764b4d..8a0e82067924 100644
--- a/drivers/irqchip/irq-imx-gpcv2.c
+++ b/drivers/irqchip/irq-imx-gpcv2.c
@@ -283,6 +283,7 @@ static int __init imx_gpcv2_irqchip_init(struct device_node *node,
         * later the GPC power domain driver will not be skipped.
         */
        of_node_clear_flag(node, OF_POPULATED);
+       fwnode_dev_initialized(domain->fwnode, false);
        return 0;
 }
Just to be sure here, I tried this patch on top of next-20220701 but
unfortunately this doesn't fix the original problem either. The timer
errors are gone though.
To clarify, you had the timer issue only with my "combine drivers" patch,
right?
That's correct.
The probe of imx8m-blk-ctrl got slightly delayed (from 0.74 to 0.90s
printk time) but results in the identical error message.
My guess is that the probe attempt of blk-ctrl is delayed now till gpc
probes (because of the device links getting created with the
fwnode_dev_initialized() fix), but by the time gpc probe finishes, the
power domains aren't registered yet because of the additional level of
device addition and probing.

Can you try the attached patch please?
Sure, it needed some small fixes though. But the error still is present.
And if that doesn't fix the issues, then enable the debug logs in the
following functions please and share the logs from boot till the
failure? If you can enable CONFIG_PRINTK_CALLER, that'd help too.
device_link_add()
fwnode_link_add()
fw_devlink_relax_cycle()
I switched fw_devlink_relax_cycle() for fw_devlink_relax_link() as the former 
has no debug output here.

For the record I added the following line to my kernel command line:
dyndbg="func device_link_add +p; func fwnode_link_add +p; func fw_devlink_relax_link +p"

I attached the dmesg until the probe error to this mail. But I noticed the 
following lines which seem interesting:
[    1.466620][    T8] imx-pgc imx-pgc-domain.5: Linked as a consumer to regulator.8
[    1.466743][    T8] imx-pgc imx-pgc-domain.5: imx_pgc_domain_probe: Probe succeeded
[    1.474733][    T8] imx-pgc imx-pgc-domain.6: Linked as a consumer to regulator.9
[    1.474774][    T8] imx-pgc imx-pgc-domain.6: imx_pgc_domain_probe: Probe succeeded

regulator.8 and regulator.9 belong to the power sequencer, which is attached
via I2C. This also makes perfect sense if you look at [1] and the lines that
follow. These power domains are supplied by specific power supply rails.
Several, if not all, imx8mq boards have this kind of setup.
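
The dependency chain suggested by these log lines can be sketched as a small userspace C model. Everything here is an assumption made for illustration: the `toy_*` names are hypothetical, `TOY_EPROBE_DEFER` merely mirrors the kernel-internal EPROBE_DEFER value, and the real imx_pgc_domain_probe() may handle a missing supply differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517 /* mirrors the kernel-internal EPROBE_DEFER */

/* Supply rail, e.g. one provided by a power sequencer behind I2C. */
struct toy_regulator {
    bool ready;
};

/* One imx-pgc-domain.N style child device (toy model only). */
struct toy_pgc_domain {
    const struct toy_regulator *supply;
    bool genpd_registered;
};

/* Child probe: the domain gets registered only once its supply is ready. */
int toy_pgc_domain_probe(struct toy_pgc_domain *pd)
{
    assert(pd != NULL && pd->supply != NULL);
    if (!pd->supply->ready)
        return -TOY_EPROBE_DEFER; /* try again after the regulator shows up */
    pd->genpd_registered = true;
    return 0;
}

/* Consumer attach with the patched behaviour: no domain means -ENODEV. */
int toy_blk_ctrl_attach(const struct toy_pgc_domain *pd)
{
    assert(pd != NULL);
    return pd->genpd_registered ? 0 : -ENODEV;
}
```

In this sketch a consumer that attaches before the rail is ready gets `-ENODEV` even though the parent gpc has already probed; it only succeeds once the child domain has been probed after its regulator.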
Btw, part of the reason I'm trying to make sure we fix it the right
way is that when we try to enable async boot by default, we don't run
into issues.
Sounds reasonable.

Best regards,
Alexander

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi#n84
