RE: [PATCH net-next 1/2] mlxsw: core: Add ethtool support for QSFP-DD transceivers
From: Vadim Pasternak <hidden>
Date: 2020-06-30 09:37:23
-----Original Message-----
From: Ido Schimmel <redacted>
Sent: Tuesday, June 30, 2020 9:00 AM
To: Andrew Lunn <andrew@lunn.ch>; Vadim Pasternak [off-list ref]
Cc: Adrian Pop <redacted>; netdev@vger.kernel.org; davem@davemloft.net; kuba@kernel.org; Jiri Pirko [off-list ref]; mlxsw [off-list ref]; Ido Schimmel [off-list ref]
Subject: Re: [PATCH net-next 1/2] mlxsw: core: Add ethtool support for QSFP-DD transceivers

On Tue, Jun 30, 2020 at 02:21:59AM +0200, Andrew Lunn wrote:
> I've no practical experience with modules other than plain old SFPs, 1G. And those have all sorts of errors, even basic things like the CRC are systematically incorrect because they are not recalculated after adding the serial number. We have had people trying to submit patches to ethtool to make it ignore bits so that it dumps more information, because the manufacturer failed to set the correct bits, etc.
>
> Ido, Adrian, what is your experience with these QSFP-DD devices? Are they generally of better quality, can the EEPROM be trusted? Is there any form of compliance test?

Vadim, I know you tested with at least two different QSFP-DD modules, can you please share your experience?
I tested two types of QSFP-DD, copper and optical, from a few vendors: Innolight, SP (Source Photonics) and Mellanox customized transceivers. We don't have enough statistics. I guess that across all our lab systems we validated about 150 - 200 cables. None of them had a wrong EEPROM. But in all Mellanox systems QSFP reading goes through the firmware, and the firmware validates the QSFP for stamping (some cable types are considered untrusted and the firmware puts them on the black list), page checksum, and power consumption criteria.
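For illustration, the page checksum validation mentioned above can be sketched as follows. This is not the firmware's code. It assumes the usual SFF-style rule that a checksum byte holds the low 8 bits of the sum of the bytes it covers. The covered range and the checksum offset differ per specification and per page, so they are passed in as parameters; the function name `page_checksum_ok` is hypothetical.

```python
def page_checksum_ok(page: bytes, start: int, end: int, csum_offset: int) -> bool:
    """Check an SFF-style checksum byte.

    Returns True when page[csum_offset] equals the low 8 bits of the sum of
    page[start..end] (both inclusive). Out-of-range arguments yield False.
    """
    if not (0 <= start <= end < len(page)):
        return False
    if not (0 <= csum_offset < len(page)):
        return False
    return (sum(page[start:end + 1]) & 0xFF) == page[csum_offset]


# Example with the SFF-8472 base checksum (CC_BASE): bytes 0-62, stored at byte 63.
# The contents below are made-up test data, not a dump of a real module.
eeprom = bytearray(64)
eeprom[0] = 0x03                       # identifier: SFP
eeprom[20:24] = b"ACME"                # hypothetical vendor name bytes
eeprom[63] = sum(eeprom[0:63]) & 0xFF  # a correctly programmed checksum
good = page_checksum_ok(bytes(eeprom), 0, 62, 63)

eeprom[21] ^= 0x01                     # field edited without recalculating
stale = page_checksum_ok(bytes(eeprom), 0, 62, 63)
```

The second case models the failure Andrew describes: a field is changed after the checksum was computed, so the stored byte no longer matches.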
> If we go down the path of using the discovery information, it means we have no way for user space to try to correct for when the information is incorrect. It cannot request specific pages. So maybe we should consider an alternative?
>
> The netlink ethtool gives us more flexibility. How about we make a new API where user space can request any pages it wants, and specify the size of the page. ethtool can start out by reading page 0. That should allow it to identify the basic class of device. It can then request additional pages as needed.

Just to make sure I understand, this also means adding a new API towards drivers, right? So that they only read from HW the requested info.
> The nice thing about that is we don't need two parsers of the discovery information, one in user space and a second in kernel space. We don't need to guarantee these two parsers agree with each other, in order to correctly decode what the kernel sent to user space. And user space has the flexibility to work around known issues when manufacturers get their EEPROM wrong.

Sounds sane to me... I know that in the past Vadim had to deal with various faulty modules. Vadim, is this something we can support? What happens if user space requests a page that does not exist? For example, in the case of QSFP-DD, let's say we do not provide page 03h but user space still wants it because it believes the manufacturer did not set the correct bits.
Regarding faulty modules, as I wrote, validation is performed by the firmware and our software trusts the firmware. Currently user space just asks for the buffer length. I suppose that if we have an additional API:

ethtool -m <if> <page> <offset> <size>

it would be possible to provide a buffer only for the defined page and up to the valid size.

Note that the CMIS specification also covers other transceiver types, and some of them we are going to support through ethtool, like:
19h (OSFP 8x Pluggable Transceiver)
1Ah (SFP-DD Double Density 2x Pluggable Transceiver)
1Eh (QSFP with QSFP-DD memory map)

If I am not wrong, 19h and 1Eh should have the same layout as QSFP-DD, and SFP-DD is supposed to be similar but shorter (page 02h is reserved; page 01h contains the info that for QSFP-DD sits at page 02h).
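As a rough user-space model of the flow discussed in this thread (read page 0, classify the module from its identifier byte, then request further pages), here is a minimal sketch. It is not ethtool or driver code. `CMIS_IDENTIFIERS`, `read_page` and the in-memory `pages` dictionary are hypothetical names that stand in for whatever the new API would return. The 18h value for QSFP-DD comes from the SFF-8024 identifier table; the other three are the ones listed above. A request for a page that is not provided returns an empty buffer, and a request that runs past the end of a page is truncated to the valid size. That is only one possible answer to Ido's question.

```python
# Identifier byte (byte 0 of page 0) -> module type, for CMIS-style modules.
CMIS_IDENTIFIERS = {
    0x18: "QSFP-DD",
    0x19: "OSFP 8x Pluggable Transceiver",
    0x1A: "SFP-DD Double Density 2x Pluggable Transceiver",
    0x1E: "QSFP with QSFP-DD memory map",
}


def read_page(pages: dict, page: int, offset: int, size: int) -> bytes:
    """Model of a <page> <offset> <size> request against a module.

    `pages` maps a page number to the bytes the module provides for it.
    Returns an empty buffer for a missing page or an invalid range, and at
    most `size` bytes otherwise (fewer if the page ends first).
    """
    data = pages.get(page)
    if data is None or offset < 0 or size <= 0 or offset >= len(data):
        return b""
    return data[offset:offset + size]


# Made-up module: page 00h is provided, page 03h is not.
pages = {0x00: bytes([0x18]) + bytes(127)}

ident = read_page(pages, 0x00, 0, 1)[0]
module_type = CMIS_IDENTIFIERS.get(ident, "unknown")

tail = read_page(pages, 0x00, 120, 64)    # truncated to the 8 valid bytes
missing = read_page(pages, 0x03, 0, 128)  # page not provided: empty buffer
```

Keeping the identifier table in user space matches Andrew's point that only one parser is needed, and it leaves room for user space to ask for a page the module did not advertise.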