Re: [drivers/net/phy/sfp] intermittent failure in state machine checks
From: ѽ҉ᶬḳ℠ <hidden>
Date: 2020-01-10 15:45:22
On 10/01/2020 15:09, Russell King - ARM Linux admin wrote:
On Fri, Jan 10, 2020 at 03:02:51PM +0000, ѽ҉ᶬḳ℠ wrote:quoted
On 10/01/2020 12:53, Russell King - ARM Linux admin wrote:quoted
quoted
quoted
Which is also indicating everything is correct. When the problem occurs, check the state of the signals again as close as possible to the event - it depends how long the transceiver keeps it asserted. You will probably find tx-fault is indicating "in hi IRQ".just discovered userland - gpioinfo pca9538 - which seems more verbose gpiochip2 - 8 lines: line 0: unnamed "tx-fault" input active-high [used] line 1: unnamed "tx-disable" output active-high [used] line 2: unnamed "rate-select0" input active-high [used] line 3: unnamed "los" input active-high [used] line 4: unnamed "mod-def0" input active-low [used] line 5: unnamed unused input active-high line 6: unnamed unused input active-high line 7: unnamed unused input active-high The above is depicting the current state with the module working, i.e. being online. Will do some testing and report back, not sure yet how to keep a close watch relating to the failure events.However, that doesn't give the current levels of the inputs, so it's useless for the purpose I've asked for.Fair enough. Operational (online) state gpiochip2: GPIOs 504-511, parent: i2c/8-0071, pca9538, can sleep: gpio-504 ( |tx-fault ) in lo IRQ gpio-505 ( |tx-disable ) out lo gpio-506 ( |rate-select0 ) in lo gpio-507 ( |los ) in lo IRQ gpio-508 ( |mod-def0 ) in lo IRQ And the same remained (unchanged) during/after the events (as closely I was able to monitor) -> module transmit fault indicatedTry: while ! grep -A4 'tx-fault.*in hi' /sys/kernel/debug/gpio; do :; done which may have a better chance of catching it.
Suppose you are not interested in what happens with ifdown wan, so just for posterity gpio-504 ( |tx-fault ) in hi IRQ gpio-505 ( |tx-disable ) out hi gpio-506 ( |rate-select0 ) in lo gpio-507 ( |los ) in lo IRQ gpio-508 ( |mod-def0 ) in lo IRQ When the iif is brought up again and happens to trigger a transmit fault the hi is not being triggered however. And it did not try 5 times to recover from the fault, unless dmesg missed some [Fri Jan 10 15:30:57 2020] mvneta f1034000.ethernet eth2: Link is Down [Fri Jan 10 15:30:57 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready [Fri Jan 10 15:31:13 2020] mvneta f1034000.ethernet eth2: configuring for inband/1000base-x link mode [Fri Jan 10 15:31:13 2020] sfp sfp: module transmit fault indicated [Fri Jan 10 15:31:15 2020] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off [Fri Jan 10 15:31:16 2020] sfp sfp: module transmit fault recovered [Fri Jan 10 15:31:16 2020] mvneta f1034000.ethernet eth2: Link is Down [Fri Jan 10 15:31:16 2020] sfp sfp: module transmit fault indicated [Fri Jan 10 15:31:19 2020] sfp sfp: module persistently indicates fault, disabling [Fri Jan 10 15:31:21 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready [Fri Jan 10 15:31:21 2020] mvneta f1034000.ethernet eth2: configuring for inband/1000base-x link mode [Fri Jan 10 15:31:21 2020] sfp sfp: module transmit fault indicated [Fri Jan 10 15:31:27 2020] sfp sfp: module persistently indicates fault, disabling [Fri Jan 10 15:38:01 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready [Fri Jan 10 15:38:01 2020] mvneta f1034000.ethernet eth2: configuring for inband/1000base-x link mode [Fri Jan 10 15:38:01 2020] sfp sfp: module transmit fault indicated [Fri Jan 10 15:38:07 2020] sfp sfp: module persistently indicates fault, disabling [Fri Jan 10 15:40:48 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready [Fri Jan 10 15:40:48 2020] mvneta f1034000.ethernet eth2: configuring for inband/1000base-x link mode [Fri Jan 10 15:40:48 2020] sfp sfp: module transmit fault indicated [Fri Jan 10 15:40:54 2020] sfp sfp: module persistently indicates fault, disabling Had to reboot the node to regain WAN connectivity.