Thread: 26 messages, 4 authors, 2022-07-14

Re: [PATCH net-next 0/5] devlink rate police limiter

From: Jiri Pirko <hidden>
Date: 2022-07-07 11:20:21

Thu, Jun 30, 2022 at 08:13:27PM CEST, kuba@kernel.org wrote:
On Thu, 30 Jun 2022 17:27:08 +0200 Dima Chumak wrote:
quoted
I've re-read more carefully the cover letter of the original 'devlink:
rate objects API' series by Dmytro Linkin, off of which I based my
patches, though my understanding still might be incomplete/incorrect
here.

It seems that TC, being ingress only, doesn't cover the full spectrum of
rate-limiting that's possible to achieve with devlink. TC works only
with representors and doesn't allow to configure "the other side of the
wire", where devlink port function seems to be a better match as it
connects directly to a VF.
Right, but you are adding Rx and Tx now, IIUC, so you're venturing into
the same "side of the wire" where tc lives.
Wait. Let's draw the basic picture of "the wire":

--------------------------+                +--------------------------
eswitch representor netdev|=====thewire====|function (vf/sf/whatever)
--------------------------+                +--------------------------

Now the rate setting Dima is talking about, it is the configuration of
the "function" side. Setting the rate is limiting the "function" TX/RX.
Note that this function could be of any type - netdev, rdma, vdpa, nvme.
Configuring the TX/RX rate (including grouping) applies to all of
these.

Putting the configuration on the eswitch representor does not fit:
1) it is configuring the other side of the wire, the configuration
   should be of the eswitch port. Configuring the other side is
   confusing and misleading. For the purpose of configuring the
   "function" side, we introduced "port function" object in devlink.
2) it is configuring netdev/ethernet, however the configuration applies
   to all queues of the function.

quoted
Also, for the existing devlink-rate mechanism of VF grouping, it would be
challenging to achieve similar functionality with TC flows, as groups don't
have a net device instance where flows can be attached.
You can share actions in TC. The hierarchical aspects may be more
limited, not sure.
quoted
I want to apologize in case my proposed changes have come across as
being bluntly ignoring some of the pre-established agreements and
understandings of TC / devlink responsibility separation, it wasn't
intentional.
Apologies, TBH I thought you're the same person I was arguing with last
time.

My objective is to avoid having multiple user space interfaces which 
drivers have to (a) support and (b) reconcile. We already have the VF 
rate limits in ip link, and in TC (which I believe is used by OvS
offload). 

I presume you have a mlx5 implementation ready, so how do you reconcile
those 3 APIs?