Thread (18 messages), 4 authors, 2020-06-30

RE: [PATCH net-next 1/2] mlxsw: core: Add ethtool support for QSFP-DD transceivers

From: Vadim Pasternak <hidden>
Date: 2020-06-30 09:37:23

-----Original Message-----
From: Ido Schimmel <redacted>
Sent: Tuesday, June 30, 2020 9:00 AM
To: Andrew Lunn <andrew@lunn.ch>; Vadim Pasternak
[off-list ref]
Cc: Adrian Pop <redacted>; netdev@vger.kernel.org;
davem@davemloft.net; kuba@kernel.org; Jiri Pirko [off-list ref];
mlxsw [off-list ref]; Ido Schimmel [off-list ref]
Subject: Re: [PATCH net-next 1/2] mlxsw: core: Add ethtool support for QSFP-DD transceivers

On Tue, Jun 30, 2020 at 02:21:59AM +0200, Andrew Lunn wrote:
quoted
I've no practical experience with modules other than plain old SFPs,
1G. And those have all sorts of errors, even basic things like the CRC
are systematically incorrect because they are not recalculated after
adding the serial number. We have had people trying to submit patches
to ethtool to make it ignore bits so that it dumps more information,
because the manufacturer failed to set the correct bits, etc.

Ido, Adrian, what is your experience with these QSFP-DD devices? Are
they generally of better quality, so that the EEPROM can be trusted? Is
there any form of compliance test?
Vadim, I know you tested with at least two different QSFP-DD modules, can
you please share your experience?
I tested two types of QSFP-DD, copper and optical, from a few vendors:
Innolight, SP (Source Photonics) and Mellanox customized transceivers.
We don't have enough statistics. I guess that across all the systems in
our lab we validated about 150 - 200 cables. None of them had a wrong EEPROM.

But in all Mellanox systems QSFP reading goes through the firmware,
and the firmware performs QSFP validation for stamping (some cable types
are considered untrusted and the firmware puts them on a blacklist),
page checksum and power consumption criteria.

quoted
If we go down the path of using the discovery information, it means we
have no way for user space to try to correct for when the information
is incorrect. It cannot request specific pages. So maybe we should
consider an alternative?

The netlink ethtool gives us more flexibility. How about we make a new
API where user space can request any pages it wants, and specify the
size of the page. ethtool can start out by reading page 0. That should
allow it to identify the basic class of device. It can then request
additional pages as needed.
Just to make sure I understand, this also means adding a new API towards
drivers, right? So that they only read from HW the requested info.
quoted
The nice thing about that is we don't need two parsers of the
discovery information, one in user and second in kernel space. We
don't need to guarantee these two parsers agree with each other, in
order to correctly decode what the kernel sent to user space. And user
space has the flexibility to work around known issues when
manufacturers get their EEPROM wrong.
Sounds sane to me... I know that in the past Vadim had to deal with various
faulty modules. Vadim, is this something we can support? What happens if
user space requests a page that does not exist? For example, in the case of
QSFP-DD, let's say we do not provide page 03h but user space still wants it
because it believes the manufacturer did not set the correct bits.
Regarding faulty modules, as I wrote, validation is performed by the firmware
and our software trusts the firmware.

Currently user space just asks for the buffer length.
I suppose that if we have an additional API such as:
ethtool -m <if> <page> <offset> <size>
it would be possible to provide a buffer only for the defined page and up to
the valid size.
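
A minimal sketch of that "only for the defined page and up to the valid size"
behaviour, written in Python purely for illustration. The function name, the
128-byte page size, the page-relative offset and the set of valid pages are
all assumptions of this sketch; none of them is an existing ethtool or mlxsw
interface.

```python
from typing import Optional, Set, Tuple

# Assumption for this sketch: every page exposes 128 bytes and the
# offset is counted from the start of the requested page.
PAGE_SIZE = 128


def clamp_page_request(page: int, offset: int, size: int,
                       valid_pages: Set[int]) -> Optional[Tuple[int, int]]:
    """Return the (offset, length) that would be served, or None.

    Hypothetical helper: a page that the module does not provide is
    refused, and a request that runs past the end of the page is cut
    down to the bytes that really exist.
    """
    if page not in valid_pages:
        return None  # e.g. page 03h requested but not provided
    if offset < 0 or offset >= PAGE_SIZE or size <= 0:
        return None  # nothing valid to return
    length = min(size, PAGE_SIZE - offset)
    return offset, length
```

With valid pages {00h, 01h, 02h}, a request for page 03h is refused, while a
request for 64 bytes at offset 100 of page 01h is shortened to the 28 bytes
that remain in the page.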

Note that the CMIS specification also covers other transceiver types,
and we are going to support some of them through ethtool, such as:
19h (OSFP 8x Pluggable Transceiver)
1Ah (SFP-DD Double Density 2x Pluggable Transceiver)
1Eh (QSFP with QSFP-DD memory map)

If I am not mistaken, 19h and 1Eh should have the same layout as QSFP-DD, and
SFP-DD is supposed to be similar but shorter (page 02h is reserved; page
01h contains the info that sits at page 02h for QSFP-DD).
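
To illustrate the grouping above, here is a small Python sketch of how user
space might pick a layout from the identifier byte before requesting more
pages. The table and function names are hypothetical, the 18h value for
QSFP-DD comes from SFF-8024 rather than from this mail, and the layout
grouping simply mirrors the tentative statement above; it has not been
checked against the specification.

```python
from typing import Optional

QSFP_DD_LAYOUT = "qsfp-dd"
SFP_DD_LAYOUT = "sfp-dd"

# Identifier byte -> memory map family (grouping as guessed in the mail).
IDENTIFIER_LAYOUT = {
    0x18: QSFP_DD_LAYOUT,  # QSFP-DD (value from SFF-8024, not from the mail)
    0x19: QSFP_DD_LAYOUT,  # OSFP 8x Pluggable Transceiver
    0x1A: SFP_DD_LAYOUT,   # SFP-DD Double Density 2x Pluggable Transceiver
    0x1E: QSFP_DD_LAYOUT,  # QSFP with QSFP-DD memory map
}


def layout_for_identifier(identifier: int) -> Optional[str]:
    """Return the layout family for an identifier, or None if unknown."""
    return IDENTIFIER_LAYOUT.get(identifier)


def info_page(layout: str) -> int:
    """Page holding the info that QSFP-DD keeps at page 02h.

    Per the mail, SFP-DD keeps it at page 01h because page 02h is reserved.
    """
    return 0x01 if layout == SFP_DD_LAYOUT else 0x02
```

An unknown identifier returns None, so the caller can fall back to dumping
only the first page instead of guessing a layout.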