Thread: 33 messages, 5 authors, 2022-10-11

Re: [RFC PATCH net-next v4 2/6] devlink: Extend devlink-rate api with queues and new parameters

From: Jiri Pirko <jiri@resnulli.us>
Date: 2022-09-29 07:13:11

Wed, Sep 28, 2022 at 01:47:03PM CEST, michal.wilczynski@intel.com wrote:

On 9/26/2022 1:51 PM, Jiri Pirko wrote:
quoted
Thu, Sep 15, 2022 at 08:41:52PM CEST, michal.wilczynski@intel.com wrote:
quoted
On 9/15/2022 5:31 PM, Edward Cree wrote:
quoted
On 15/09/2022 14:42, Michal Wilczynski wrote:
quoted
Currently devlink-rate only has two types of objects: nodes and leafs.
There is a need to extend this interface to account for a third type of
scheduling element: queues. In our use case a customer is sending
different types of traffic on each queue, which requires the ability to
assign rate parameters to individual queues.
Is there a use-case for this queue scheduling in the absence of a netdevice?
If not, then I don't see how this belongs in devlink; the configuration
should instead be done in two parts: devlink-rate to schedule between
different netdevices (e.g. VFs) and tc qdiscs (or some other netdev-level
API) to schedule different queues within each single netdevice.
Please explain why this existing separation does not support your use-case.

Also I would like to see some documentation as part of this patch. It looks
like there's no kernel document for devlink-rate unlike most other devlink
objects; perhaps you could add one?

-ed
Hi,
Previously we discussed adding queues to devlink-rate in this thread:
https://lore.kernel.org/netdev/20220704114513.2958937-1-michal.wilczynski@intel.com/T/#u
In our use case we are trying to find a way to expose the hardware Tx
scheduler tree, which is defined per port, to the user. Obviously, if the
tree is defined per physical port, all the scheduling nodes will reside
on the same tree.

Our customer is trying to send different types of traffic that require
different QoS levels on the same
Do I understand correctly that you are assigning traffic to queues
in the VM, and you rate-limit the queues on the hypervisor? Is that the goal?
Yes.
Why do you have this mismatch? It forces the hypervisor and VM admin to
somehow sync on the configuration. That does not sound correct to me.

quoted
quoted
VM, but on different queues. This requires completely different rate setups
for each queue - in the implementation that you're mentioning we wouldn't be
able to arbitrarily reassign a queue to any node.
Those queues would still need to share a single parent - their netdev. This
So that replies to Edward's note about having the queues maintained
within the single netdev/vport, correct?
Correct ;)
Okay. So you don't really need any kind of sharing devlink might be able
to provide.

From what you say and how I see this, it's clear. You should handle the
per-queue shaping in the VM, at the netdevice level, most probably by
offloading one of the TC qdiscs.
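
For readers following the thread, the disagreement can be sketched in a few lines of Python. This is an illustration only, not taken from the patch and not the devlink or tc API: the `SchedNode` class, its `kind` values, the `owner` field and the `reparent` helper are all hypothetical names. It models a per-port Tx scheduler tree and contrasts the netdev-level split (a queue may only hang under the netdev that owns it) with the RFC's proposal (a queue may be attached to any node).

```python
class SchedNode:
    """Hypothetical scheduling element: 'node', 'leaf' (a VF netdev) or 'queue'."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.parent = None
        self.children = []
        self.owner = None  # for queues: the leaf (netdev) they belong to

    def attach(self, child):
        # Unlink from the previous parent, if any, then link under this node.
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        # The first leaf a queue is attached to is treated as its netdev.
        if child.kind == "queue" and child.owner is None and self.kind == "leaf":
            child.owner = self


def reparent(queue, new_parent, queues_stay_in_netdev):
    """Try to move a queue; return True if the move was allowed.

    queues_stay_in_netdev=True models the devlink-rate + tc split,
    where a queue can only be scheduled under its own netdev.
    queues_stay_in_netdev=False models the RFC, where a queue can
    be placed under any node of the port-wide tree.
    """
    if queue.kind != "queue":
        raise ValueError("only queues are moved in this sketch")
    if queues_stay_in_netdev and new_parent is not queue.owner:
        return False
    new_parent.attach(queue)
    return True


def build_tree():
    """Build: port -> {gold node, vf0 leaf -> {q0, q1}}."""
    port = SchedNode("port", "node")
    gold = SchedNode("gold", "node")
    vf0 = SchedNode("vf0", "leaf")
    q0 = SchedNode("vf0-q0", "queue")
    q1 = SchedNode("vf0-q1", "queue")
    port.attach(gold)
    port.attach(vf0)
    vf0.attach(q0)
    vf0.attach(q1)
    return port, gold, vf0, q0, q1
```

With `queues_stay_in_netdev=True`, moving `vf0-q1` under `gold` is refused and the queue stays under `vf0`; with `False`, the move succeeds. That is the "arbitrary reassignment" the RFC asks for, and the coupling between hypervisor and VM configuration that the reply above objects to.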

quoted
quoted
wouldn't allow us to fully take
advantage of the HQoS and would introduce arbitrary limitations.

Also I would think that since there is only one vendor implementing this
particular devlink-rate API, there is some room for flexibility.

Regarding the documentation, sure. I just wanted to get all the feedback
from the mailing list and arrive at the final solution before writing the
docs.

BTW, I'm going to be out of office tomorrow, so will respond in this thread
on Monday.
BR,
Michał