Re: [drivers/net/phy/sfp] intermittent failure in state machine checks
From: ѽ҉ᶬḳ℠ <hidden>
Date: 2020-01-10 12:46:01
On 10/01/2020 11:44, Russell King - ARM Linux admin wrote:
Which is also indicating everything is correct. When the problem occurs, check the state of the signals again as close as possible to the event - it depends how long the transceiver keeps it asserted. You will probably find tx-fault is indicating "in hi IRQ".
Just discovered the userland tool gpioinfo (gpioinfo pca9538), which seems more verbose:
gpiochip2 - 8 lines:
	line 0: unnamed "tx-fault" input active-high [used]
	line 1: unnamed "tx-disable" output active-high [used]
	line 2: unnamed "rate-select0" input active-high [used]
	line 3: unnamed "los" input active-high [used]
	line 4: unnamed "mod-def0" input active-low [used]
	line 5: unnamed unused input active-high
	line 6: unnamed unused input active-high
	line 7: unnamed unused input active-high
The above depicts the current state with the module working, i.e. online. I will do some testing and report back; I am not sure yet how to keep a close watch on the signals around the failure events.
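One way to keep a close watch might be to poll the kernel's GPIO debugfs listing and log every change of the tx-fault line. The following is only a rough sketch: it assumes debugfs is mounted and readable at /sys/kernel/debug/gpio, and that each line looks roughly like "gpio-N (name|label) in hi IRQ", which varies between kernel versions. The names find_line and watch are made up for illustration, and polling may well miss a short TX_FAULT pulse.

```python
import re
import time

# Assumed layout of a debugfs GPIO line, e.g.
#   " gpio-504 (                    |tx-fault            ) in  hi IRQ"
# The exact format differs between kernel versions.
LINE_RE = re.compile(
    r"^\s*gpio-(?P<num>\d+)\s*\((?P<names>[^)]*)\)\s+"
    r"(?P<dir>in|out)\s+(?P<level>hi|lo)(?P<rest>.*)$"
)


def find_line(debug_text, label):
    """Return (direction, level, flags) for the first GPIO carrying `label`."""
    for raw in debug_text.splitlines():
        m = LINE_RE.match(raw)
        if not m:
            continue
        names = [n.strip() for n in m.group("names").split("|")]
        if label in names:
            return m.group("dir"), m.group("level"), m.group("rest").split()
    return None


def watch(label="tx-fault", path="/sys/kernel/debug/gpio", interval=0.05):
    """Print a timestamped entry whenever the state of `label` changes."""
    last = None
    while True:
        with open(path) as f:
            state = find_line(f.read(), label)
        if state != last:
            print(time.time(), label, state)
            last = state
        time.sleep(interval)
```

Running watch() as root while reproducing the failure would give timestamps that can be compared against the sfp messages in dmesg.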
- it would appear that SFP.C is trying to communicate with Fiber-GBIC and fails since the signal reports may not be 100% compatible
That's a fun claim, but note carefully the wording "may" which implies some uncertainty in the statement.
It was a verbatim translation, but yes, the correspondence in the original language implies the same uncertainty.
Let's look at the wording of the GBIC (SFF-8053) and SFP (INF-8074 -
SFP MSA) documents. The wording for the "fault recovery" is identical
between the two, which concerns what happens when TX_FAULT is asserted
and how to recover from that.
Concerning the implementation of TX_FAULT, SFF-8053 states:
If no transmitter safety circuitry is implemented, the TX_FAULT signal
may be tied to its negated state.
but then says later in the document:
If TX_FAULT is not implemented, the signal shall be held to the low
state by the GBIC.
Meanwhile, INF-8074 similarly states:
If no transmitter safety circuitry is implemented, the TX_FAULT signal
may be tied to its negated state.
but later on has a similar statement:
TX_FAULT shall be implemented by those module definitions of SFP
transceiver supporting safety circuitry. If TX_FAULT is not
implemented, the signal shall be held to the low state by the SFP
transceiver.
"shall" in both cases is stronger than "may". So, there seems to be
little difference between the GBIC and SFP usage of this signal.
Their claim is that sfp.c implements the older GBIC style of signal
reports. My counter-claim is that (a) sfp.c is written to the SFP MSA
and not the GBIC standard, and (b) there is no difference as far as the
TX_FAULT signal is concerned between the GBIC standard and the SFP MSA.
But... it doesn't matter that much, there's a module out there (and it
isn't the only one) which does "funny stuff" with its TX_FAULT signal.
Either we decide we want to support it and implement a quirk, or we
decide we don't want to support it.
There is an option bit in the EEPROM that is supposed to indicate
whether the module supports TX_FAULT, but, as you can guess, there are
problems with using that, as:
1) there are a lot of modules, particularly optical modules, that
implement TX_FAULT correctly but don't set the option bit to say
that they support the signal.
2) the other module I'm aware of that does "funny stuff" with its
TX_FAULT signal does have the TX_FAULT option bit set.
So, the option bit is completely untrustworthy and, therefore, is
meaningless (which is why we don't use it.)
Even with "shall" carrying a potentially higher weight than "may", it still does not imply something obligatory (set in stone) and leaves potential wiggle room.
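For anyone wanting to look at the EEPROM option bit discussed above on their own module, a minimal sketch follows. It assumes a raw dump of the module's base ID page and that the TX_FAULT option is bit 3 of byte 65, as I understand the SFP MSA / SFF-8472 options field; the function name is hypothetical. As noted above, the result says little about how the module actually drives the signal.

```python
# Assumed location of the option bit in the base ID page:
# byte 65 is the low byte of the 16-bit options field, bit 3 = "TX_FAULT implemented".
OPTIONS_LOW_BYTE = 65
TX_FAULT_BIT = 1 << 3


def tx_fault_option_set(eeprom: bytes) -> bool:
    """Report whether the dump advertises TX_FAULT support in its options field."""
    if len(eeprom) <= OPTIONS_LOW_BYTE:
        raise ValueError("EEPROM dump too short to contain the options field")
    return bool(eeprom[OPTIONS_LOW_BYTE] & TX_FAULT_BIT)
```

A dump could come from, for example, the binary output of "ethtool -m <iface> raw on", with the interface name depending on the board.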