Thread: 18 messages, 4 authors, 2020-06-30

Re: [PATCH net-next 1/2] mlxsw: core: Add ethtool support for QSFP-DD transceivers

From: Ido Schimmel <hidden>
Date: 2020-06-30 05:59:54

On Tue, Jun 30, 2020 at 02:21:59AM +0200, Andrew Lunn wrote:
> I've no practical experience with modules other than plain old SFPs,
> 1G. And those have all sorts of errors, even basic things like the CRC
> are systematically incorrect because they are not recalculated after
> adding the serial number. We have had people trying to submit patches
> to ethtool to make it ignore bits so that it dumps more information,
> because the manufacturer failed to set the correct bits, etc.
>
> Ido, Adrian, what is your experience with these QSFP-DD devices? Are
> they generally of better quality, so that the EEPROM can be trusted?
> Is there any form of compliance test?

Vadim, I know you tested with at least two different QSFP-DD modules,
can you please share your experience?

> If we go down the path of using the discovery information, it means we
> have no way for user space to try to correct for when the information
> is incorrect. It cannot request specific pages. So maybe we should
> consider an alternative?
>
> The netlink ethtool gives us more flexibility. How about we make a new
> API where user space can request any pages it wants, and specify the
> size of the page. ethtool can start out by reading page 0. That should
> allow it to identify the basic class of device. It can then request
> additional pages as needed.

Just to make sure I understand, this also means adding a new API towards
drivers, right? So that they only read from HW the requested info.

> The nice thing about that is we don't need two parsers of the
> discovery information, one in user space and a second in kernel space.
> We don't need to guarantee these two parsers agree with each other, in
> order to correctly decode what the kernel sent to user space. And user
> space has the flexibility to work around known issues when
> manufacturers get their EEPROM wrong.

Sounds sane to me... I know that in the past Vadim had to deal with
various faulty modules. Vadim, is this something we can support? What
happens if user space requests a page that does not exist? For example,
in the case of QSFP-DD, let's say we do not provide page 03h but user
space still wants it because it believes the manufacturer did not set
the correct bits.
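
To make the proposed flow concrete, here is a minimal Python model of it:
read page 0 first, use the identifier byte to classify the module, then
request further pages one at a time. Nothing in it is an existing kernel
or ethtool interface. FakeModule, read_page, dump_module and the 128-byte
page size are hypothetical names and values chosen for illustration, and
failing with EINVAL when a page is not provided is only one possible
answer to the question above, not a decision from this thread.

```python
import errno

PAGE_SIZE = 128  # assumed size of one page in this model

# SFF-8024 identifier value for QSFP-DD modules.
SFF8024_ID_QSFP_DD = 0x18


class FakeModule:
    """Hypothetical stand-in for a driver that reads only the requested page."""

    def __init__(self, pages):
        self.pages = pages  # maps page number -> bytes

    def read_page(self, page, offset=0, length=PAGE_SIZE):
        if page not in self.pages:
            # Assumption: reject the request explicitly rather than
            # return made-up data for a page the module does not provide.
            raise OSError(errno.EINVAL, f"page {page:02x}h not provided")
        data = self.pages[page]
        if offset < 0 or length < 0 or offset + length > len(data):
            raise OSError(errno.EINVAL, "read outside page")
        return data[offset:offset + length]


def dump_module(module, extra_pages):
    """Read page 0, classify the module, then request the extra pages."""
    dump = {0: module.read_page(0)}
    ident = dump[0][0]  # identifier is assumed to be byte 0 of page 0
    missing = []
    if ident == SFF8024_ID_QSFP_DD:
        for page in extra_pages:
            try:
                dump[page] = module.read_page(page)
            except OSError:
                # User space decides what to do about a missing page.
                missing.append(page)
    return ident, dump, missing
```

With a module that provides pages 00h to 02h only, calling
dump_module(module, (0x01, 0x02, 0x03)) returns the three readable pages
and reports 03h as missing, so the caller still gets a partial dump and
can apply its own workarounds.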