Thread (40 messages) 40 messages, 3 authors, 2020-01-11

Re: [drivers/net/phy/sfp] intermittent failure in state machine checks

From: ѽ҉ᶬḳ℠ <hidden>
Date: 2020-01-10 15:45:22

On 10/01/2020 15:09, Russell King - ARM Linux admin wrote:
On Fri, Jan 10, 2020 at 03:02:51PM +0000, ѽ҉ᶬḳ℠ wrote:
quoted
On 10/01/2020 12:53, Russell King - ARM Linux admin wrote:
quoted
quoted
quoted
Which is also indicating everything is correct. When the problem
occurs, check the state of the signals again as close as possible
to the event - it depends how long the transceiver keeps it
asserted. You will probably find tx-fault is indicating
"in hi IRQ".
just discovered userland - gpioinfo pca9538 - which seems more verbose

gpiochip2 - 8 lines:
        line   0:      unnamed   "tx-fault"   input active-high [used]
        line   1:      unnamed "tx-disable"  output active-high [used]
        line   2:      unnamed "rate-select0" input active-high [used]
        line   3:      unnamed        "los"   input active-high [used]
        line   4:      unnamed   "mod-def0"   input active-low [used]
        line   5:      unnamed       unused   input active-high
        line   6:      unnamed       unused   input active-high
        line   7:      unnamed       unused   input active-high

The above is depicting the current state with the module working, 
i.e. being
online. Will do some testing and report back, not sure yet how to 
keep a
close watch relating to the failure events.
However, that doesn't give the current levels of the inputs, so it's
useless for the purpose I've asked for.
Fair enough. Operational (online) state

gpiochip2: GPIOs 504-511, parent: i2c/8-0071, pca9538, can sleep:
 gpio-504 ( |tx-fault     ) in  lo IRQ
 gpio-505 ( |tx-disable   ) out lo
 gpio-506 ( |rate-select0 ) in  lo
 gpio-507 ( |los          ) in  lo IRQ
 gpio-508 ( |mod-def0     ) in  lo IRQ

And the same remained (unchanged) during/after the events (as closely 
I was
able to monitor) -> module transmit fault indicated
Try:

while ! grep -A4 'tx-fault.*in hi' /sys/kernel/debug/gpio; do :; done

which may have a better chance of catching it.
Suppose you are not interested in what happens with ifdown wan, so just 
for posterity

  gpio-504 (                    |tx-fault            ) in  hi IRQ
  gpio-505 (                    |tx-disable          ) out hi
  gpio-506 (                    |rate-select0        ) in  lo
  gpio-507 (                    |los                 ) in  lo IRQ
  gpio-508 (                    |mod-def0            ) in  lo IRQ


When the iif is brought up again and happens to trigger a transmit fault 
the hi is not being triggered however. And it did not try 5 times to 
recover from the fault, unless dmesg missed some

[Fri Jan 10 15:30:57 2020] mvneta f1034000.ethernet eth2: Link is Down
[Fri Jan 10 15:30:57 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not 
ready
[Fri Jan 10 15:31:13 2020] mvneta f1034000.ethernet eth2: configuring 
for inband/1000base-x link mode
[Fri Jan 10 15:31:13 2020] sfp sfp: module transmit fault indicated
[Fri Jan 10 15:31:15 2020] mvneta f1034000.ethernet eth2: Link is Up - 
1Gbps/Full - flow control off
[Fri Jan 10 15:31:16 2020] sfp sfp: module transmit fault recovered
[Fri Jan 10 15:31:16 2020] mvneta f1034000.ethernet eth2: Link is Down
[Fri Jan 10 15:31:16 2020] sfp sfp: module transmit fault indicated
[Fri Jan 10 15:31:19 2020] sfp sfp: module persistently indicates fault, 
disabling
[Fri Jan 10 15:31:21 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not 
ready
[Fri Jan 10 15:31:21 2020] mvneta f1034000.ethernet eth2: configuring 
for inband/1000base-x link mode
[Fri Jan 10 15:31:21 2020] sfp sfp: module transmit fault indicated
[Fri Jan 10 15:31:27 2020] sfp sfp: module persistently indicates fault, 
disabling

[Fri Jan 10 15:38:01 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not 
ready
[Fri Jan 10 15:38:01 2020] mvneta f1034000.ethernet eth2: configuring 
for inband/1000base-x link mode
[Fri Jan 10 15:38:01 2020] sfp sfp: module transmit fault indicated
[Fri Jan 10 15:38:07 2020] sfp sfp: module persistently indicates fault, 
disabling

[Fri Jan 10 15:40:48 2020] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not 
ready
[Fri Jan 10 15:40:48 2020] mvneta f1034000.ethernet eth2: configuring 
for inband/1000base-x link mode
[Fri Jan 10 15:40:48 2020] sfp sfp: module transmit fault indicated
[Fri Jan 10 15:40:54 2020] sfp sfp: module persistently indicates fault, 
disabling

Had to reboot the node to regain WAN connectivity.





Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help