Re: [PATCH net-next v10 10/10] ice: add documentation for devlink-rate implementation
From: Jakub Kicinski <kuba@kernel.org>
Date: 2022-11-08 22:39:46
On Mon, 7 Nov 2022 19:13:26 +0100 Michal Wilczynski wrote:
Add documentation to a newly added devlink-rate feature. Provide some examples on how to use the features, which netlink attributes are supported and descriptions of the attributes.
+Devlink Rate
+==========
+
+The ``ice`` driver implements devlink-rate API. It allows for offload of
+the Hierarchical QoS to the hardware. It enables user to group Virtual
+Functions in a tree structure and assign supported parameters: tx_share,
+tx_max, tx_priority and tx_weight to each node in a tree. So effectively
+user gains an ability to control how much bandwidth is allocated for each
+VF group. This is later enforced by the HW.
+
+It is assumed that this feature is mutually exclusive with DCB and ADQ, or
+any driver feature that would trigger changes in QoS, for example creation
+of the new traffic class.
Meaning? Will the devlink API no longer reflect reality once one of the VFs enables DCB for example?
+This feature is also dependent on switchdev
+being enabled in the system. It's required bacause devlink-rate requires
+devlink-port objects to be present, and those objects are only created
+in switchdev mode.
+
+If the driver is set to the switchdev mode, it will export
+internal hierarchy the moment the VF's are created. Root of the tree
+is always represented by the node_0. This node can't be deleted by the user.
+Leaf nodes and nodes with children also can't be deleted.
+
+.. list-table:: Attributes supported
+    :widths: 15 85
+
+    * - Name
+      - Description
+    * - ``tx_max``
+      - This attribute allows for specifying a maximum bandwidth to be
Drop the "This attribute allows for specifying a" from all attrs.
+        consumed by the tree Node. Rate Limit is an absolute number
+        specifying a maximum amount of bytes a Node may consume during
+        the course of one second. Rate limit guarantees that a link will
+        not oversaturate the receiver on the remote end and also enforces
+        an SLA between the subscriber and network provider.
+    * - ``tx_share``
Wouldn't it be more common to call this tx_min, like in the old VF API and the cgroup APIs?
+      - This attribute allows for specifying a minimum bandwidth allocated
+        to a tree node when it is not blocked. It specifies an absolute
+        BW. While tx_max defines the maximum bandwidth the node may consume,
+        the tx_share marks committed BW for the Node.
+    * - ``tx_priority``
+      - This attribute allows for usage of strict priority arbiter among
+        siblings. This arbitration scheme attempts to schedule nodes based
+        on their priority as long as the nodes remain within their
+        bandwidth limit. Range 0-7.
"Nodes" meaning it will (W)RR across all nodes of the highest prio? Is prio 0 or 7 the highest?
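To make the two readings concrete, here is a minimal Python sketch. It assumes strict priority means "only siblings at the winning priority level are eligible, and they share the link by (W)RR among themselves". The helper name `eligible_siblings`, the `zero_is_highest` flag and the sibling names are hypothetical; nothing here models the ice hardware or the devlink API.

```python
def eligible_siblings(priorities, zero_is_highest):
    """Return the siblings that strict priority would serve first.

    priorities: dict mapping a (hypothetical) node name to its
    tx_priority value in the documented range 0-7.
    zero_is_highest: which end of the range wins. This is exactly
    the part the documentation leaves open.
    """
    for name, prio in priorities.items():
        if not 0 <= prio <= 7:
            raise ValueError(f"{name}: tx_priority {prio} outside 0-7")
    winner = min(priorities.values()) if zero_is_highest else max(priorities.values())
    # Every sibling at the winning level is eligible; they would then
    # be arbitrated among themselves (for example by weight).
    return sorted(n for n, p in priorities.items() if p == winner)


siblings = {"vf0": 0, "vf1": 7, "vf2": 7}
print(eligible_siblings(siblings, zero_is_highest=True))
print(eligible_siblings(siblings, zero_is_highest=False))
```

The same configuration serves a different set of nodes depending on the convention, which is why the documentation should state it.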
+    * - ``tx_weight``
+      - This attribute allows for usage of Weighted Fair Queuing
+        arbitration scheme among siblings. This arbitration scheme can be
+        used simultaneously with the strict priority. Range 1-200.
Would be good to specify how the interaction with SP looks. Does the absolute value of the weight matter or only the relative values? (IOW is 1 vs 10 the same as 10 vs 100)
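As an illustration of the "relative vs absolute" question, a minimal Python sketch, assuming an idealized WFQ model in which every backlogged sibling receives weight / sum(weights) of the parent's bandwidth. The function name `wfq_shares` and the node names are hypothetical, and whether the hardware scheduler actually behaves this way is precisely what the documentation should say.

```python
from fractions import Fraction


def wfq_shares(weights):
    """Idealized WFQ: share of each backlogged sibling is weight / total.

    weights: dict mapping a (hypothetical) node name to its tx_weight
    value in the documented range 1-200.
    """
    for name, w in weights.items():
        if not 1 <= w <= 200:
            raise ValueError(f"{name}: tx_weight {w} outside 1-200")
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


small = wfq_shares({"vf0": 1, "vf1": 10})
large = wfq_shares({"vf0": 10, "vf1": 100})
print(small == large)
```

Under this model only the ratios matter, so 1 vs 10 and 10 vs 100 give identical shares. If the absolute values have an effect on the real scheduler, the documentation needs to describe it.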