Thread (22 messages) 22 messages, 5 authors, 2023-08-30

Re: [PATCH 2/2] net: dsa: microchip: Provide Module 4 KSZ9477 errata (DS80000754C)

From: Lukasz Majewski <lukma@denx.de>
Date: 2023-08-29 12:39:43
Also in: lkml

Hi Oleksij,
Hi Lukasz,

On Tue, Aug 29, 2023 at 01:24:29PM +0200, Lukasz Majewski wrote:
quoted
Hi Vladimir,
  
quoted
Hi Lukasz,

On Tue, Aug 29, 2023 at 10:35:33AM +0200, Lukasz Majewski wrote:  
quoted
Hi Vladimir,
    
quoted
On Fri, Aug 25, 2023 at 06:48:41PM +0000,
Tristram.Ha@microchip.com wrote:    
quoted
quoted
quoted
IMHO adding functions to MMD modification would
facilitate further development (for example LED setup).
     
We already have some KSZ9477 specific initialization done
in the Micrel PHY driver under drivers/net/phy/micrel.c,
can we converge on the PHY driver which has a reasonable
amount of infrastructure for dealing with workarounds,
indirect or direct MMD accesses etc.?      
Actually the internal PHY used in the
KSZ9897/KSZ9477/KSZ9893 switches are special and only used
inside those switches. Putting all the switch related code
in Micrel PHY driver does not really help.  When the switch
is reset all those PHY registers need to be set again, but
the PHY driver only executes those code during PHY
initialization.  I do not know if there is a good way to
tell the PHY to re-initialize again. 
Suppose there was a method to tell the PHY driver to
re-initialize itself. What would be the key points in which
the DSA switch driver would need to trigger that method?
Where is the switch reset at runtime?    
Tristam has explained why adding the internal switch PHY errata
to generic PHY code is not optimal.    
Yes, and I didn't understand that explanation, so I asked a
clarification question.  
Ok. Let's wait for Tristram's answer.
  
quoted
  
quoted
If adding MMD generic code is a problem - then I'm fine with
just clearing proper bits with just two indirect writes in the
drivers/net/dsa/microchip/ksz9477.c

I would also prefer to keep the separate ksz9477_errata()
function, so we could add other errata code there.

Just informative - without this patch the KSZ9477-EVB board's
network is useless when the other peer has EEE enabled by
default (like almost all non managed ETH switches).    
No, adding direct PHY MMD access code to the ksz9477 switch
driver is not even the biggest problem - even though, IIUC, the
"workaround" to disable EEE advertisement could be moved to
ksz9477_get_features() in drivers/net/phy/micrel.c, where
phydev->supported_eee could be cleared.  
To be even more interesting (after looking into the PHY micrel.c
code):
https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/micrel.c#L1804

The errata from this patch is already present.

The issue is that ksz9477_config_init() (drivers/net/phy/micrel.c)
is executed AFTER generic phy_probe():
https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L3256
in which the EEE advertisement registers are read.

Hence, those registers needs to be cleared earlier - as I do in
ksz9477_setup() in drivers/net/dsa/microchip/ksz9477.

Here the precedence matters ...  
quoted
The biggest problem that I see is that Oleksij Rempel has "just"
added EEE support to the KSZ9477 earlier this year, with an ack
from Arun Ramadoss: 69d3b36ca045 ("net: dsa: microchip: enable
EEE support"). I'm not understanding why the erratum wasn't a
discussion topic then.  
+1  
As this erratum states:  "this feature _can_ cause link drops".
For example I was indeed able to have EEE relates issue between this
switch and a link partner with AR8035 PHY. Following patch addressing
this issue:
https://lore.kernel.org/all/20230327142202.3754446-8-o.rempel@pengutronix.de/ (local)
So, in this case KSZ9477 was not the bad side.
The errata: http://ww1.microchip.com/downloads/jp/DeviceDoc/jp599888.pdf

Module 4, "End user implications":
--------8<----------
If the link partner is not known, or if the link partner is EEE
capable, then the EEE feature should be manually disabled to avoid link
drop problems.
-------->8----------
Since this erratum do not describe exact cause of this issue
IMHO, it does - "The EEE feature is enabled by default, but it is not
fully operational. "

It looks like some silicon issue - which in details is probably only
known to Micrel/Microchip.
or
specific link partners where this functionality is not working, I
would prefer to give the user the freedom of choice.
The problem is that - the user - would encounter broken network when
connected to per advertising EEE. 

Hence, I would prefer to apply the Errata and then somebody, who would
like to enable EEE can try if it works for him.

IMHO, code to fix erratas shall be added unconditionally, without any
"freedom of choice".
The same issue we have with Pause Frame support. It is not always a
good choice, but user has freedom to configure it.

Today I wont to create a test setup with different EEE capable link
partners on one side and KSZ9477 on other side and let it run some
days. Just to make sure.

Beside, are you able to reproduce this issue?
Yes, I can reproduce the issue. I do use two Microchip's development
boards (KSZ9477-EVB [1]) connected together to test HSR as well as
communication with HOST PC.

The network on this board without this patch is not usable (continually
I do encounter link up/downs).

Another test scenario is to connect this board to non-managed ETH
switch (which shall have the EEE advertised by default).


Please be also aware, that this errata fix is (implicitly I think)
already present in the kernel:
https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/micrel.c#L1804

However, the execution order of PHY/DSA functions with newest mainline
makes it not working any more (I've described it in details in the
earlier mail to Vladimir).
Regards,
Oleksij
Links:
[1] - https://www.microchip.com/en-us/development-tool/evb-ksz9477-1


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

Attachments

  • (unnamed) [application/pgp-signature] 488 bytes
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help