Thread (38 messages), 8 authors, 2018-08-01

Re: [net-next 10/16] net/mlx5: Support PCIe buffer congestion handling via Devlink

From: Jiri Pirko <jiri@resnulli.us>
Date: 2018-07-26 08:32:31

Thu, Jul 26, 2018 at 02:43:59AM CEST, jakub.kicinski@netronome.com wrote:
On Wed, 25 Jul 2018 08:23:26 -0700, Alexander Duyck wrote:
On Wed, Jul 25, 2018 at 5:31 AM, Eran Ben Elisha wrote:
On 7/24/2018 10:51 PM, Jakub Kicinski wrote:
The devlink params haven't been upstream even for a full cycle and
already you guys are starting to use them to configure standard
features like queuing.  
We developed the devlink params in order to support non-standard
configuration only. And for non-standard, there are generic and vendor
specific options.  
I thought it was developed for performing non-standard and possibly
vendor specific configuration.  Look at DEVLINK_PARAM_GENERIC_* for
examples of well justified generic options for which we have no
other API.  The vendor mlx4 options look fairly vendor specific if you
ask me, too.

Configuring queuing has an API.  The question is whether it is acceptable
to enter into the risky territory of controlling offloads via devlink
parameters, or would we rather make vendors take the time and effort to
model things to (a subset of) existing APIs.  The HW never fits the APIs
perfectly.
I understand what you meant here. I would like to highlight that this
mechanism was not meant to handle SRIOV, Representors, etc.
The vendor specific configuration suggested here is to handle a congestion
state in a Multi Host environment (which includes a PF and multiple VFs per
host), where one host is not aware of the other hosts, and each is running
on its own pci/driver. It is a device working mode configuration.

This couldn't fit into any existing API, thus creating this vendor specific
unique API is needed.
If we are just going to start creating devlink interfaces for every
one-off option a device wants to add, why did we even bother with
trying to prevent drivers from using sysfs? This just feels like we
are back to the same arguments we had back in the day with it.

I feel like the bigger question here is if devlink is how we are going
to deal with all PCIe related features going forward, or should we
start looking at creating a new interface/tool for PCI/PCIe related
features? My concern is that we have already had features such as DMA
Coalescing that didn't really fit into anything and now we are
starting to see other things related to DMA and PCIe bus credits. I'm
wondering if we shouldn't start looking at a tool/interface to
configure all the PCIe related features such as interrupts, error
reporting, DMA configuration, power management, etc. Maybe we could
even look at sharing it across subsystems and include things like
storage, graphics, and other subsystems in the conversation.
Agreed, for actual PCIe configuration (i.e. not ECN marking) we do need
to build up an API.  Sharing it across subsystems would be very cool!
I wonder how come there isn't such an API in place already. Or is there?
If there is not, do you have any idea what it should look like? Should it be
an extension of the existing PCI uapi or something completely new?
It would probably be good to loop some PCI people in...