Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT
From: Saravana Kannan <hidden>
Date: 2021-08-27 18:06:47
Also in:
linux-acpi, lkml
On Fri, Aug 27, 2021 at 6:44 AM Andrew Lunn [off-list ref] wrote:
quoted
fw_devlink=on/device links short circuits the probe() call of a consumer (in this case the PHY) and returns -EPROBE_DEFER if the supplier's (in this case switch) probe hasn't finished without an error. fw_devlink/device links effectively does the probe in graph topological order and there's a ton of good reasons to do it that way -- what's why fw_devlink=on was implemented. In this specific case though, since the PHY depends on the parent device, if we fail the parent's probe realtek_smi_probe() because the PHYs failed to probe, we'll get into a catch-22/chicken-n-egg situation and the switch/PHYs will never probe.So lets look at: arch/arm/boot/dts/vf610-zii-dev-rev-b.dts mdio-mux { compatible = "mdio-mux-gpio"; pinctrl-0 = <&pinctrl_mdio_mux>; pinctrl-names = "default"; gpios = <&gpio0 8 GPIO_ACTIVE_HIGH &gpio0 9 GPIO_ACTIVE_HIGH &gpio0 24 GPIO_ACTIVE_HIGH &gpio0 25 GPIO_ACTIVE_HIGH>; mdio-parent-bus = <&mdio1>; #address-cells = <1>; #size-cells = <0>; We have an MDIO multiplexor mdio_mux_1: mdio@1 { reg = <1>; #address-cells = <1>; #size-cells = <0>; switch0: switch@0 { compatible = "marvell,mv88e6085"; pinctrl-0 = <&pinctrl_gpio_switch0>; pinctrl-names = "default"; reg = <0>; dsa,member = <0 0>; interrupt-parent = <&gpio0>; interrupts = <27 IRQ_TYPE_LEVEL_LOW>; On the first bus, we have a Ethernet switch. interrupt-controller; #interrupt-cells = <2>; eeprom-length = <512>; ports { #address-cells = <1>; #size-cells = <0>; port@0 { reg = <0>; label = "lan0"; phy-handle = <&switch0phy0>; }; The first port of that switch has a pointer to a PHY. mdio { #address-cells = <1>; #size-cells = <0>; That Ethernet switch also has an MDIO bus, switch0phy0: switch0phy0@0 { reg = <0>; On that bus is the PHY. interrupt-parent = <&switch0>; interrupts = <0 IRQ_TYPE_LEVEL_HIGH>; And that PHY has an interrupt. And that interrupt is provided by the switch. Given your description, it sounds like this is also go to break.
Based on what you pasted here (I didn't look any closer), I think it will break too.
vf610-zii-dev-rev-c.dts is the same pattern, and there are more examples for mv88e6xxx. It is a common pattern, e.g. the mips ar9331.dtsi follows it.
Then I think this should be solved at the DSA framework level. Make a component-master/aggregate device made up of the switches and ports/PHYs. Then wait for all of them to not -EPROBE_DEFER and then initialize the DSA?
I've not yet looked at plain Ethernet drivers. This pattern could also exist there. And i wonder about other complex structures, i2c bus multiplexors, you can have interrupt controllers as i2c devices, etc. So the general case could exist in other places.
I haven't seen any generic issues like this reported so far. It's only after adding phy-handle that we are hitting these issues with DSA switches.
I don't think we should be playing whack-a-mole by changing drivers as we find they regress and break. We need a generic fix. I think the solution is pretty clear. As you said the device depends on its parent. DT is a tree, so it is easy to walk up the tree to detect this relationship, and not fail the probe.
It's easy to do, but it is the wrong behavior for fw_devlink=on. There are plenty of cases where it's better to delay the child device's probe until the parent finishes. You even gave an example[7] where it would help avoid unnecessary deferred probes. There are plenty of other cases like this too -- there's actually a USB driver that had an infinite deferred probe loop that fw_devlink=on fixes. Also, the whole point of fw_devlink=on is to enforce ordering like this -- so just blanket ignoring dependencies on parent devices doesn't make sense. But a parent device's probe depending on a child device's probe to succeed as soon as it's added is never right though. So I think that's what needs to be addresses. So we have a couple of options: 1. Use a component driver model to initialize switches. I think it could be doable at the DSA framework level. 2. Ask fw_devlink=on to ignore it for all switch devices -- it might be possible to move my "quick fix" to the DSA framework. 3. Remove fw_devlink support for phy-handle. I honestly think (1) is the best option and makes sense logically too. Not saying it's a trivial work or a one liner, but it actually makes sense. (2) might not be possible -- I need to take a closer look. I'd prefer not doing (3), but I'd take that over breaking the whole point of fw_devlink=on. -Saravana [7] - https://lore.kernel.org/netdev/YR5eMeKzcuYtB6Tk@lunn.ch/ (local)