Thread (40 messages), 3 authors, 2020-01-11

Re: [drivers/net/phy/sfp] intermittent failure in state machine checks

From: ѽ҉ᶬḳ℠ <hidden>
Date: 2020-01-10 12:46:01

On 10/01/2020 11:44, Russell King - ARM Linux admin wrote:
Which is also indicating everything is correct.  When the problem
occurs, check the state of the signals again as close as possible
to the event - it depends how long the transceiver keeps it
asserted.  You will probably find tx-fault is indicating
"in  hi IRQ".
I just discovered a userland tool, gpioinfo pca9538, whose output seems more verbose:

gpiochip2 - 8 lines:
         line   0:      unnamed   "tx-fault"   input  active-high [used]
         line   1:      unnamed "tx-disable"  output  active-high [used]
         line   2:      unnamed "rate-select0" input active-high [used]
         line   3:      unnamed        "los"   input  active-high [used]
         line   4:      unnamed   "mod-def0"   input   active-low [used]
         line   5:      unnamed       unused   input  active-high
         line   6:      unnamed       unused   input  active-high
         line   7:      unnamed       unused   input  active-high

The above shows the current state with the module working, i.e. online.
I will do some testing and report back; I am not yet sure how to keep a
close watch on the failure events.
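One possible way to keep a closer watch, offered only as a rough sketch: poll the kernel's debugfs GPIO listing (the place where a state such as "in  hi IRQ" is reported) and log a timestamp whenever a labelled line changes. The line layout matched by the regular expression, the GPIO numbers in the comments, and the helper names are assumptions for illustration; debugfs must be mounted and readable, and a polling loop can miss pulses shorter than the interval.

```python
import re
import time

# Assumed line shape of /sys/kernel/debug/gpio (GPIO numbers are hypothetical):
#  gpio-504 (                    |tx-fault            ) in  hi IRQ
LINE_RE = re.compile(
    r"gpio-(\d+)\s*\(([^|]*)\|([^)]*)\)\s*(in|out)\s+(hi|lo)\b(.*)"
)


def parse_debug_gpio(text):
    """Return {label: (direction, level, flags)} for every labelled line."""
    state = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        label = m.group(3).strip()
        if label:
            state[label] = (m.group(4), m.group(5), m.group(6).strip())
    return state


def diff_states(old, new):
    """List (label, before, after) for each label whose state changed."""
    return [(k, old.get(k), new[k]) for k in sorted(new) if old.get(k) != new[k]]


def watch(path="/sys/kernel/debug/gpio", interval=0.1):
    """Poll the listing forever and print timestamped changes."""
    prev = {}
    while True:
        with open(path) as f:
            cur = parse_debug_gpio(f.read())
        for label, before, after in diff_states(prev, cur):
            print(time.strftime("%H:%M:%S"), label, before, "->", after)
        prev = cur
        time.sleep(interval)
```

Calling watch() as root while reproducing the failure would print one line per change, so a tx-fault transition can be lined up against the kernel log by time.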
- it would appear that SFP.C is trying to communicate with Fiber-GBIC and
fails since the signal reports may not be 100% compatible
That's a fun claim, but note carefully the wording "may" which implies
some uncertainty in the statement.
It was a verbatim translation, but yes, the same uncertainty is implied
in the original-language correspondence.
Let's look at the wording of the GBIC (SFF-8053) and SFP (INF-8074 -
SFP MSA) documents.  The wording for the "fault recovery" is identical
between the two, which concerns what happens when TX_FAULT is asserted
and how to recover from that.

Concerning the implementation of TX_FAULT, SFF-8053 states:

   If no transmitter safety circuitry is implemented, the TX_FAULT signal
   may be tied to its negated state.

but then says later in the document:

   If TX_FAULT is not implemented, the signal shall be held to the low
   state by the GBIC.

Meanwhile, INF-8074 similarly states:

   If no transmitter safety circuitry is implemented, the TX_FAULT signal
   may be tied to its negated state.

but later on has a similar statement:

   TX_FAULT shall be implemented by those module definitions of SFP
   transceiver supporting safety circuitry. If TX_FAULT is not
   implemented, the signal shall be held to the low state by the SFP
   transceiver.

"shall" in both cases is stronger than "may".  So, there seems to be
little difference between the GBIC and SFP usage of this signal.

Their claim is that sfp.c implements the older GBIC style of signal
reports.  My counter-claim is that (a) sfp.c is written to the SFP MSA
and not the GBIC standard, and (b) there is no difference as far as the
TX_FAULT signal is concerned between the GBIC standard and the SFP MSA.

But... it doesn't matter that much, there's a module out there (and it
isn't the only one) which does "funny stuff" with its TX_FAULT signal.
Either we decide we want to support it and implement a quirk, or we
decide we don't want to support it.

There is an option bit in the EEPROM that is supposed to indicate
whether the module supports TX_FAULT, but, as you can guess, there are
problems with using that, as:

1) there are a lot of modules, particularly optical modules, that
    implement TX_FAULT correctly but don't set the option bit to say
    that they support the signal.

2) the other module I'm aware of that does "funny stuff" with its
    TX_FAULT signal does have the TX_FAULT option bit set.

So, the option bit is completely untrustworthy and, therefore, is
meaningless (which is why we don't use it.)
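For anyone who wants to see what a particular module advertises, here is a minimal sketch of reading that option bit from a raw dump of the module's ID EEPROM. It assumes the layout I understand SFF-8472 to define (a 16-bit options field at bytes 64-65, with TX_FAULT as bit 3 of byte 65); the function name and constants are illustrative, not taken from sfp.c. As described above, the result only shows what the module claims, not how its TX_FAULT pin actually behaves.

```python
OPTIONS_OFFSET = 64        # assumed: options field occupies bytes 64-65
OPT_TX_FAULT = 1 << 3      # assumed: TX_FAULT is bit 3 of byte 65


def advertises_tx_fault(eeprom: bytes) -> bool:
    """Return True if the raw ID EEPROM dump has the TX_FAULT option bit set."""
    if len(eeprom) < OPTIONS_OFFSET + 2:
        raise ValueError("EEPROM dump too short to contain the options field")
    # Read the two option bytes as one big-endian 16-bit value.
    options = int.from_bytes(eeprom[OPTIONS_OFFSET:OPTIONS_OFFSET + 2], "big")
    return bool(options & OPT_TX_FAULT)
```

A raw dump could be obtained with something like "ethtool -m <interface> raw on" redirected to a file, assuming the network driver supports module EEPROM access.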
Even if "shall" carries more weight than "may", it still does not imply
something obligatory (set in stone), and it leaves some wiggle room.