Re: [PATCH v4 net-next 0/8] Let phylink manage in-band AN for the PHY
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: 2022-11-22 00:17:15
On Mon, Nov 21, 2022 at 05:42:44PM -0500, Sean Anderson wrote:
Are you certain this is the cause of the issue? It's also possible that there is some errata for the PCS which is causing the issue. I have gotten no review/feedback from NXP regarding the phylink conversion (aside from acks for the cleanups).
Erratum which does what out of the ordinary? Your description of the hardware failure seems consistent with the most plausible explanation that doesn't involve any bugs. If you enable C37/SGMII AN in the PCS (of the PHY or of the MAC) and AN does not complete (because it's not enabled on the other end), the system side of the link remains down. Which you don't see when you operate in MLO_AN_PHY mode, because phylink only considers the PCS link state in MLO_AN_INBAND mode. So this is why you see the link as up but it doesn't work. To confirm whether I'm right or wrong, there's a separate SERDES Interrupt Status Register at page 0xde1 offset 0x12, whose bit 4 is "SERDES link status change" and bit 0 is "SERDES auto-negotiation error". These bits should both be set when you double-read them (regardless of IRQ enable, I think) when your link is down with MLO_AN_PHY, but should be cleared with MLO_AN_INBAND.
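The check suggested above can be sketched as a small decoder for the value read from page 0xde1, offset 0x12. This is a standalone illustration, not driver code: the macro, struct and function names are made up here, and only the page, offset and bit positions come from the mail. In a PHY driver the raw value would presumably come from phy_read_paged(phydev, 0xde1, 0x12), read twice as suggested.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page, offset and bit positions as quoted in the mail; names are hypothetical. */
#define SK_SERDES_ISR_PAGE	0xde1
#define SK_SERDES_ISR_REG	0x12
#define SK_SERDES_ISR_AN_ERR	(1u << 0)	/* "SERDES auto-negotiation error" */
#define SK_SERDES_ISR_LINK_CHG	(1u << 4)	/* "SERDES link status change" */

struct sk_serdes_isr {
	bool link_change;
	bool an_error;
};

/* Split a raw 16-bit register value into the two bits of interest. */
static struct sk_serdes_isr serdes_isr_decode(uint16_t val)
{
	struct sk_serdes_isr isr = {
		.link_change = (val & SK_SERDES_ISR_LINK_CHG) != 0,
		.an_error = (val & SK_SERDES_ISR_AN_ERR) != 0,
	};

	return isr;
}

/*
 * True when both bits are set, which is what the mail expects to see if
 * in-band AN failed to complete on the system side.
 */
static bool serdes_isr_suggests_an_failure(uint16_t val)
{
	struct sk_serdes_isr isr = serdes_isr_decode(val);

	return isr.link_change && isr.an_error;
}
```

With this, a raw value of 0x0011 would count as consistent with the AN-failure explanation, while 0x0000 would not.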
This is used for SGMII to RGMII bridge mode (figure 4). It doesn't seem to contain useful information for UTP mode (figure 1).
So it would seem. It was a hasty read last time, sorry. Re-reading, the field says that when it's set, the SGMII code word being transmitted is "selected by the register" SGMII ANAR. And in the SGMII ANLPAR, you can see what the MAC said. Of course, it doesn't say what happens when the bit for software-driven SGMII autoneg is *not* set, or whether the process can be bypassed at all. I suppose now that it can't; otherwise the ANLPAR register could also be writable over MDIO; they would likely have reused at least partly the same mechanisms.
+	ret = phy_read_paged(phydev, 0xd08, RTL8211FS_SGMII_ANARSEL);

That said, you have to use the "Indirect access method" to access this register (per section 8.5). This is something like

#define RTL8211F_IAAR			0x1b
#define RTL8211F_IADR			0x1c

#define RTL8211F_IAAR_PAGE		GENMASK(15, 4)
#define RTL8211F_IAAR_REG		GENMASK(3, 1)
#define INDIRECT_ADDRESS(page, reg) \
	(FIELD_PREP(RTL8211F_IAAR_PAGE, page) | \
	 FIELD_PREP(RTL8211F_IAAR_REG, reg - 16))

	ret = phy_write_paged(phydev, 0xa43, RTL8211F_IAAR,
			      INDIRECT_ADDRESS(0xd08, RTL8211FS_SGMII_ANARSEL));
	if (ret < 0)
		return ret;

	ret = phy_read_paged(phydev, 0xa43, RTL8211F_IADR);
	if (ret < 0)
		return ret;

I dumped the rest of the serdes registers using this method, but I didn't see anything interesting (all defaults).
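To make the quoted address encoding concrete, here is a standalone sketch of what INDIRECT_ADDRESS() would compute, taking the quoted field layout at face value (page in bits 15:4, register minus 16 in bits 3:1). The names are userspace stand-ins invented for this example, and the 16..23 register range is an assumption inferred from the 3-bit field width, not something stated in the mail. Whether this access method applies to the RTL8211FS at all is disputed in the reply that follows.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as quoted: page in bits 15:4, (reg - 16) in bits 3:1. */
#define SK_IAAR_PAGE_SHIFT	4
#define SK_IAAR_PAGE_MASK	0xfff0u
#define SK_IAAR_REG_SHIFT	1
#define SK_IAAR_REG_MASK	0x000eu

/*
 * Compose the value that would be written to the address register.
 * Assumption: valid paged registers are 16..23, since the register
 * field is only 3 bits wide.
 */
static uint16_t sk_indirect_address(unsigned int page, unsigned int reg)
{
	assert(page <= 0xfff);
	assert(reg >= 16 && reg <= 23);

	return (uint16_t)(((page << SK_IAAR_PAGE_SHIFT) & SK_IAAR_PAGE_MASK) |
			  (((reg - 16) << SK_IAAR_REG_SHIFT) & SK_IAAR_REG_MASK));
}
```

For page 0xd08, register 16 encodes to 0xd080 and register 17 to 0xd082 under these assumptions.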
I'm _really_ not sure where you got the "Indirect access method" via registers 0x1b/0x1c from. My datasheet for RTL8211FS doesn't show offsets 0x1b and 0x1c in page 0xa43. Additionally, I cross-checked with other registers that are accessed by the driver (like the Interrupt Enable Register), and the driver access procedure - phy_write_paged(phydev, 0xa42, RTL821x_INER, val) - seems to be pretty much in line with what my datasheet shows.
I think it would be better to just return PHY_AN_INBAND_ON when using SGMII.
Well, of course hardcoding PHY_AN_INBAND_ON in the driver is on the table, if it isn't possible to alter this setting to the best of our knowledge (or if it's implausible that someone modified it). And this seems more and more like the case.
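The hardcoded fallback being discussed could look roughly like the sketch below. The enums and the function are local stand-ins invented for this example, not the definitions from the patch series, and returning "unknown" for every non-SGMII interface is an assumption made only to keep the example complete.

```c
/* Local stand-ins; the real series uses PHY_AN_INBAND_* and phy_interface_t. */
enum sk_an_inband {
	SK_AN_INBAND_UNKNOWN,
	SK_AN_INBAND_OFF,
	SK_AN_INBAND_ON,
};

enum sk_interface {
	SK_INTERFACE_RGMII,
	SK_INTERFACE_SGMII,
};

/*
 * Report in-band AN as always on for SGMII instead of trying to read the
 * setting back from the PHY. Other interfaces are reported as unknown
 * (assumption for this sketch).
 */
static enum sk_an_inband sk_validate_an_inband(enum sk_interface interface)
{
	if (interface == SK_INTERFACE_SGMII)
		return SK_AN_INBAND_ON;

	return SK_AN_INBAND_UNKNOWN;
}
```

The trade-off is that a hardcoded answer stays correct only as long as nothing (bootloader, strapping, firmware) changes the PHY's SGMII AN setting behind the driver's back.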