Thread (40 messages), 3 authors, 2020-01-11

Re: [drivers/net/phy/sfp] intermittent failure in state machine checks

From: ѽ҉ᶬḳ℠ <hidden>
Date: 2020-01-09 23:50:19

On 09/01/2020 23:10, Russell King - ARM Linux admin wrote:
> Please don't use mii-tool with SFPs that do not have a PHY; the "PHY"
> registers are emulated, and are there just for compatibility. Please
> use ethtool in preference, especially for SFPs.
Sure, but ethtool is not much help for this particular matter: all
there is is ethtool -m, and according to you the EEPROM dump is not to be
relied on.
> CONFIG_DEBUG_GPIO is not the same as having debugfs support enabled.
> If debugfs is enabled, then gpiolib will provide the current state
> of gpios through debugfs.  debugfs is normally mounted on
> /sys/kernel/debug, but may not be mounted by default depending on
> policy.  Looking in /proc/filesystems will tell you definitively
> whether debugfs is enabled or not in the kernel.
debugfs is mounted, but ls -af /sys/kernel/debug/gpio only produces
(oddly):

/sys/kernel/debug/gpio
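
A minimal shell sketch of the check described above, assuming a POSIX shell with awk. /proc/filesystems lists each filesystem name in its last column. As far as I can tell, /sys/kernel/debug/gpio is a regular file rather than a directory, which would explain why ls prints only its path; reading it with cat should show the gpio state. The helper names has_fs and dump_gpios are made up for illustration.

```shell
# has_fs NAME [FILE]: succeed if filesystem NAME is listed in FILE
# (FILE defaults to /proc/filesystems; the name is the last field per line).
has_fs() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

# dump_gpios [FILE]: print the gpiolib state file if it is readable
# (FILE defaults to /sys/kernel/debug/gpio; reading it usually needs root).
dump_gpios() {
    f="${1:-/sys/kernel/debug/gpio}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "cannot read $f (is debugfs mounted, and are you root?)" >&2
        return 1
    fi
}
```

Usage would be along the lines of `has_fs debugfs && dump_gpios`, run as root.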
> So, if that is correct...

> Current OpenWRT is derived from 4.19-stable kernels, which include
> experimental patches picked at some point from my "phy" branch, and
> TOS is derived from OpenWRT.
This may not be correct, since there are not many device targets in
OpenWrt that feature an SFP cage (at least as of today); the Turris Omnia
might even be the sole one.
I did not check whether the code was/is available in OpenWrt, and
likely it is not, but it was in an earlier TOS version, since their
platforms apparently feature an SFP cage.
> That makes it very difficult for anyone in the mainline kernel
> community to do anything about this; sending you a patch is likely
> useless since you're not going to be able to test it.
I understand. I just reached out all the way upstream since the other
available avenues, starting all the way downstream, did not produce
anything tangible or even a response.
I am grateful that at least you finally obliged and shed some light on
the matter. Maybe I should just try finding a module that is declared
SFP MSA compliant...
> You think the state machines are doing something clever. They don't.
> They are all very simple and quite dumb.
Not really; I assume it just does what it is supposed to do, in line with
current (industry) standards and best practices.
> The only real way to get to the bottom of it is to manually enable
> debug in sfp.c so its possible to watch what happens, not only with
> the hardware signals but also what the state machines are doing.
> However, I'm very certain that there is no problem with the state
> machines, and it is that the Allnet module is raising TX_FAULT.
I am sure it does, and I am pursuing Allnet for a response, although it is
not looking promising at the moment. Once there is one, however, I shall
pick up the thread again.
> I also think from what you've said above that rebuilding a kernel
> to enable debug in sfp.c is going to not be possible for you.
No. I might be able to get this done for amd64, but with this ARM SoC
there are all kinds of other things (SPI, MTD, I2C, u-boot and whatnot)
involved, and I am afraid it will go sideways if I attempt compiling.
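
One possible way to get the sfp.c debug output without rebuilding, sketched under two assumptions that are not verified for TOS: that the running kernel was built with CONFIG_DYNAMIC_DEBUG, and that the messages of interest in sfp.c are emitted through dev_dbg()/pr_debug(). The helper name enable_sfp_debug is hypothetical; the control file path and the "file ... +p" query are the standard dynamic debug interface.

```shell
# enable_sfp_debug [CONTROL_FILE]: ask dynamic debug to print all
# debug messages that originate from sfp.c.
# CONTROL_FILE defaults to the usual debugfs location; writing it needs root.
enable_sfp_debug() {
    ctl="${1:-/sys/kernel/debug/dynamic_debug/control}"
    if [ ! -w "$ctl" ]; then
        echo "no writable $ctl (CONFIG_DYNAMIC_DEBUG missing, or not root?)" >&2
        return 1
    fi
    # "+p" turns on printing for every matching debug call site.
    echo 'file sfp.c +p' > "$ctl"
}
```

After running `enable_sfp_debug` as root, the state machine messages should appear in the kernel log (dmesg) when the module is plugged in. If the control file does not exist, this route is not available and a rebuild would still be needed.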