Thread (40 messages) 40 messages, 6 authors, 2022-07-19

Re: [RFC PATCH net-next 3/9] net: pcs: Add helpers for registering and finding PCSs

From: Sean Anderson <hidden>
Date: 2022-07-11 21:55:19
Also in: linux-devicetree, lkml


On 7/11/22 5:47 PM, Sean Anderson wrote:
Hi Russell,

On 7/11/22 4:59 PM, Russell King (Oracle) wrote:
quoted
Hi Sean,

It's a good attempt and may be nice to have, but I'm afraid the
implementation has a flaw to do with the lifetime of data structures
which always becomes a problem when we have multiple devices being
used in aggregate.

On Mon, Jul 11, 2022 at 12:05:13PM -0400, Sean Anderson wrote:
quoted
+/**
+ * pcs_get_tail() - Finish getting a PCS
+ * @pcs: The PCS to get, or %NULL if one could not be found
+ *
+ * This performs common operations necessary when getting a PCS (chiefly
+ * incrementing reference counts)
+ *
+ * Return: @pcs, or an error pointer on failure
+ */
+static struct phylink_pcs *pcs_get_tail(struct phylink_pcs *pcs)
+{
+	if (!pcs)
+		return ERR_PTR(-EPROBE_DEFER);
+
+	if (!try_module_get(pcs->ops->owner))
+		return ERR_PTR(-ENODEV);
What you're trying to prevent here is the PCS going away - but holding a
reference to the module doesn't prevent that with the driver model. The
driver model design is such that a device can be unbound from its driver
at any moment. Taking a reference to the module doesn't prevent that,
all it does is ensure that the user can't remove the module. It doesn't
mean that the "pcs" structure will remain allocated.
So how do things like (serdes) phys work? Presumably the same hazard
occurs any time a MAC uses a phy, because the phy can disappear at any time.

As it happens I can easily trigger an Oops by unbinding my serdes driver
and the plugging in an ethernet cable. Presumably this means that the phy
subsystem needs the devlink treatment? There are already several in-tree
MAC drivers using phys...
quoted
The second issue that this creates is if a MAC driver creates the PCS
and then "gets" it through this interface, then the MAC driver module
ends up being locked in until the MAC driver devices are all unbound,
which isn't friendly at all.
The intention here is not to use this for "internal" PCSs, but only for
external ones. I suppose you're referring to 
(looks like I forgot a sentence here)

...things like MAC->MDIO->PCS. I'll have to investigate whether this can
be removed properly.
quoted
So, anything that proposes to create a new subsystem where we have
multiple devices that make up an aggregate device needs to nicely cope
with any of those devices going away. For that to happen in this
instance, phylink would need to know that its in-use PCS for a
particular MAC is going away, then it could force the link down before
removing all references to the PCS device.

Another solution would be devlinks, but I am really not a fan of that
when there may be a single struct device backing multiple network
interfaces, where some of them may require PCS and others do not.
I wonder if we could add an enable/disable callback of some kind, and
remove the devlink when the PCS was not in use. Not ideal, but then all
you have to do is set the interface correctly before removing the PCS.
quoted
One
wouldn't want the network interface with nfs-root to suddenly go away
because a PCS was unbound from its driver!
Well, you can also do

echo "mmc0:0001" > /sys/bus/mmc/drivers/mmcblk/unbind

which will (depending on your system) have the same effect.

If being able to unbind any driver at any time is intended,
then I don't think we can save userspace from itself.
quoted
quoted
+	get_device(pcs->dev);
This helps, but not enough. All it means is the struct device won't
go away, the "pcs" can still go away if the device is unbound from the
driver.
--Sean
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help