Thread (57 messages) 57 messages, 6 authors, 2021-09-30

Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

From: Saravana Kannan <hidden>
Date: 2021-08-27 18:06:47
Also in: linux-acpi, lkml

On Fri, Aug 27, 2021 at 6:44 AM Andrew Lunn [off-list ref] wrote:
quoted
fw_devlink=on/device links short circuits the probe() call of a
consumer (in this case the PHY) and returns -EPROBE_DEFER if the
supplier's (in this case switch) probe hasn't finished without an
error. fw_devlink/device links effectively does the probe in graph
topological order and there's a ton of good reasons to do it that way
-- what's why fw_devlink=on was implemented.

In this specific case though, since the PHY depends on the parent
device, if we fail the parent's probe realtek_smi_probe() because the
PHYs failed to probe, we'll get into a catch-22/chicken-n-egg
situation and the switch/PHYs will never probe.
So lets look at:

arch/arm/boot/dts/vf610-zii-dev-rev-b.dts

       mdio-mux {
                compatible = "mdio-mux-gpio";
                pinctrl-0 = <&pinctrl_mdio_mux>;
                pinctrl-names = "default";
                gpios = <&gpio0 8  GPIO_ACTIVE_HIGH
                         &gpio0 9  GPIO_ACTIVE_HIGH
                         &gpio0 24 GPIO_ACTIVE_HIGH
                         &gpio0 25 GPIO_ACTIVE_HIGH>;
                mdio-parent-bus = <&mdio1>;
                #address-cells = <1>;
                #size-cells = <0>;


We have an MDIO multiplexor


                mdio_mux_1: mdio@1 {
                        reg = <1>;
                        #address-cells = <1>;
                        #size-cells = <0>;

                        switch0: switch@0 {
                                compatible = "marvell,mv88e6085";
                                pinctrl-0 = <&pinctrl_gpio_switch0>;
                                pinctrl-names = "default";
                                reg = <0>;
                                dsa,member = <0 0>;
                                interrupt-parent = <&gpio0>;
                                interrupts = <27 IRQ_TYPE_LEVEL_LOW>;

On the first bus, we have a Ethernet switch.

                                interrupt-controller;
                                #interrupt-cells = <2>;
                                eeprom-length = <512>;

                                ports {
                                        #address-cells = <1>;
                                        #size-cells = <0>;

                                        port@0 {
                                                reg = <0>;
                                                label = "lan0";
                                                phy-handle = <&switch0phy0>;
                                        };

The first port of that switch has a pointer to a PHY.

                               mdio {
                                        #address-cells = <1>;
                                        #size-cells = <0>;

That Ethernet switch also has an MDIO bus,

                                        switch0phy0: switch0phy0@0 {
                                                reg = <0>;

On that bus is the PHY.

                                                interrupt-parent = <&switch0>;
                                                interrupts = <0 IRQ_TYPE_LEVEL_HIGH>;

And that PHY has an interrupt. And that interrupt is provided by the switch.

Given your description, it sounds like this is also go to break.
Based on what you pasted here (I didn't look any closer), I think it
will break too.
vf610-zii-dev-rev-c.dts is the same pattern, and there are more
examples for mv88e6xxx.

It is a common pattern, e.g. the mips ar9331.dtsi follows it.
Then I think this should be solved at the DSA framework level. Make a
component-master/aggregate device made up of the switches and
ports/PHYs. Then wait for all of them to not -EPROBE_DEFER and then
initialize the DSA?
I've not yet looked at plain Ethernet drivers. This pattern could also
exist there. And i wonder about other complex structures, i2c bus
multiplexors, you can have interrupt controllers as i2c devices,
etc. So the general case could exist in other places.
I haven't seen any generic issues like this reported so far. It's only
after adding phy-handle that we are hitting these issues with DSA
switches.
I don't think we should be playing whack-a-mole by changing drivers as
we find they regress and break. We need a generic fix. I think the
solution is pretty clear. As you said the device depends on its
parent. DT is a tree, so it is easy to walk up the tree to detect this
relationship, and not fail the probe.
It's easy to do, but it is the wrong behavior for fw_devlink=on. There
are plenty of cases where it's better to delay the child device's
probe until the parent finishes. You even gave an example[7] where it
would help avoid unnecessary deferred probes. There are plenty of
other cases like this too -- there's actually a USB driver that had an
infinite deferred probe loop that fw_devlink=on fixes. Also, the whole
point of fw_devlink=on is to enforce ordering like this -- so just
blanket ignoring dependencies on parent devices doesn't make sense.

But a parent device's probe depending on a child device's probe to
succeed as soon as it's added is never right though. So I think that's
what needs to be addresses.

So we have a couple of options:
1. Use a component driver model to initialize switches. I think it
could be doable at the DSA framework level.
2. Ask fw_devlink=on to ignore it for all switch devices -- it might
be possible to move my "quick fix" to the DSA framework.
3. Remove fw_devlink support for phy-handle.

I honestly think (1) is the best option and makes sense logically too.
Not saying it's a trivial work or a one liner, but it actually makes
sense. (2) might not be possible -- I need to take a closer look. I'd
prefer not doing (3), but I'd take that over breaking the whole point
of fw_devlink=on.

-Saravana

[7] - https://lore.kernel.org/netdev/YR5eMeKzcuYtB6Tk@lunn.ch/ (local)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help