Thread (49 messages), 6 authors, 2021-01-19

Re: [PATCH mlx5-next v1 1/5] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs

From: Leon Romanovsky <leon@kernel.org>
Date: 2021-01-12 06:16:22
Also in: linux-pci, netdev

On Mon, Jan 11, 2021 at 10:25:42PM -0500, Don Dutile wrote:
On 1/11/21 2:30 PM, Alexander Duyck wrote:
On Sun, Jan 10, 2021 at 7:12 AM Leon Romanovsky [off-list ref] wrote:
From: Leon Romanovsky <leonro@nvidia.com>

Extend the PCI sysfs interface with a new callback that allows configuring
the number of MSI-X vectors for a specific SR-IOV VF. This is needed
to optimize the performance of newly bound devices by allocating
the number of vectors based on the administrator's knowledge of the targeted VM.

This function is applicable to SR-IOV VFs because such devices allocate
their MSI-X table before they run on the VMs and HW can't guess the
right number of vectors, so the HW allocates them statically and equally.

The newly added /sys/bus/pci/devices/.../vf_msix_vec file will be seen
for the VFs and it is writable as long as a driver is not bound to the VF.

The values accepted are:
  * > 0 - this will be the number reported by the VF's MSI-X capability
  * < 0 - not valid
  * = 0 - will reset to the device default value
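
The three cases above could be sketched in userspace C roughly as follows.
This is only an illustration of the accepted-value rules, not code from the
patch: the enum, the function name vf_msix_classify() and the handling of a
trailing newline are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical names for illustration; none of them come from the patch. */
enum vf_msix_action {
	VF_MSIX_INVALID,	/* negative or unparsable input */
	VF_MSIX_RESET,		/* 0: go back to the device default */
	VF_MSIX_SET,		/* > 0: requested MSI-X vector count */
};

/*
 * Classify a string written to the attribute according to the rules in
 * the commit message. On VF_MSIX_SET or VF_MSIX_RESET the parsed value
 * is stored in *count; on VF_MSIX_INVALID *count is left untouched.
 */
static enum vf_msix_action vf_msix_classify(const char *buf, int *count)
{
	char *end;
	long val;

	assert(buf != NULL && count != NULL);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return VF_MSIX_INVALID;

	/* Assumption: tolerate the single newline that `echo` appends. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return VF_MSIX_INVALID;

	if (val < 0 || val > INT_MAX)
		return VF_MSIX_INVALID;

	*count = (int)val;
	return val == 0 ? VF_MSIX_RESET : VF_MSIX_SET;
}
```

Whether a positive value is actually honoured would still depend on the
device and its driver; the sketch only covers the sign check.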

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
  Documentation/ABI/testing/sysfs-bus-pci | 20 ++++++++
  drivers/pci/iov.c                       | 62 +++++++++++++++++++++++++
  drivers/pci/msi.c                       | 29 ++++++++++++
  drivers/pci/pci-sysfs.c                 |  1 +
  drivers/pci/pci.h                       |  2 +
  include/linux/pci.h                     |  8 +++-
  6 files changed, 121 insertions(+), 1 deletion(-)
<...>
+
This doesn't make sense to me. You are getting the vector count for
the PCI device and reporting that. Are you expecting to call this on
the PF or the VFs? It seems like this should be a PF attribute and not
be called on the individual VFs.

If you are calling this on the VFs then it doesn't really make any
sense anyway since the VF is not a "VF PCI dev representor" and
shouldn't be treated as such. In my opinion if we are going to be
doing per-port resource limiting that is something that might make
more sense as a part of the devlink configuration for the VF since the
actual change won't be visible to an assigned device.
If the op were just limited to NIC ports, devlink could be used; but I believe Leon is trying to handle it from an SR-IOV/VF perspective for other non-NIC devices as well,
e.g., IB ports and NVMe VFs (which don't have a port concept at all).
Right, SR-IOV VFs are common entities outside of the NIC/devlink world.
In addition to the netdev world, SR-IOV is used for crypto, storage, FPGA
and IB devices.

From what I see, there are three possible ways to configure the MSI-X vector count:
1. The PCI device is on the same CPU - a regular server/desktop as we know it.
2. The PCI device is on a remote CPU - the SmartNIC use case, where the CPUs are
connected through the eswitch model.
3. Some direct interface - DEVX for mlx5_ib.

This implementation handles item #1 for all devices without any exception.
From my point of view, the majority of interested users of this feature
will use this option.

The second item should be solved differently (with devlink) when configuring
the eswitch port, but it is orthogonal to item #1 and I will do it afterwards.

The third item is not managed by the kernel, so it is not relevant to our discussion.

Thanks