Thread (55 messages) 55 messages, 10 authors, 2021-08-17

Re: [PATCH 09/12] PCI: Add a PCI_ID_F_VFIO_DRIVER_OVERRIDE flag to struct pci_device_id

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2021-07-27 17:15:03
Also in: kvm, linux-doc, linux-kbuild, linux-pci

On Tue, Jul 27, 2021 at 10:34:18AM -0600, Alex Williamson wrote:
On Wed, 21 Jul 2021 19:16:06 +0300
Yishai Hadas [off-list ref] wrote:
quoted
From: Max Gurtovoy <mgurtovoy@nvidia.com>

The new flag field is be used to allow PCI drivers to signal the core code
during driver matching and when generating the modules.alias information.

The first use will be to define a VFIO flag that indicates the PCI driver
is a VFIO driver.

VFIO drivers have a few special properties compared to normal PCI drivers:
 - They do not automatically bind. VFIO drivers are used to swap out the
   normal driver for a device and convert the PCI device to the VFIO
   subsystem.

   The admin must make this choice and following the current uAPI this is
   usually done by using the driver_override sysfs.

 - The modules.alias includes the IDs of the VFIO PCI drivers, prefixing
   them with 'vfio_pci:' instead of the normal 'pci:'.

   This allows the userspace machinery that switches devices to VFIO to
   know what kernel drivers support what devices and allows it to trigger
   the proper device_override.

As existing tools do not recognize the "vfio_pci:" mod-alias prefix this
keeps todays behavior the same. VFIO remains on the side, is never
autoloaded and can only be activated by direct admin action.

This patch is the infrastructure to provide the information in the
modules.alias to userspace and enable the only PCI VFIO driver. Later
series introduce additional HW specific VFIO PCI drivers.
I don't really understand why we're combining the above "special
properties" into a single flag. 
Currently I can't think of any reason to have two flags. We always
need both behaviors together. It is trivial for someone to change down
the road, so I prefer to keep the flag bit usage to a minimum.
For instance, why wouldn't we create a flag that just indicates a
match entry is only for driver override?
We still need to signal the generation of vfio_pci: string in the
modules.alias.
Or if we're only using this for full wildcard matches, we could
detect that even without a flag.
The mlx/hns/etc drivers will not use wildcard matches. This series is
the prep and the only driver we have right at this point is the
wildcard vfio_pci generic driver.
Then, how does the "vfio_pci:" alias extend to other drivers?  
After the HW drivers are merged we have a list of things in the
modules.alias file. Eg we might have something like:

alias vfio_pci:v000015B3d00001011sv*sd*bc*sc*i* mlx5_vfio_pci
alias vfio_pci:v0000abc1d0000abcdsv*sd*bc*sc*i* hns_vfio_pci
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci

This flag, and the vfio_pci string, is only for the VFIO subsystem. If
someday another subsystem wants to use driver_override then it will
provide its own subsystem name here instead.

This is solving the problem you had at the start - that userspace must
be able to self identify the drivers.  Starting with a PCI BDF
userspace can match the modules.alias for vfio_pci: prefixes and
determine which string to put into the driver_override sysfs. This is
instead of having userspace hardwire vfio_pci.
Is this expected to be the only driver that would use an alias ever
or would other drivers use new bits of the flag?
Not sure what you mean by "only driver"? As above every driver
implementing VFIO on top of PCI will use this flag. If another
subsystem wants to use driver_override it will define its own flag,
and it's userspace will look for othersubsytem_pci: tags in
modules.alias when it wants to change a PCI device over.
Seems some documentation is necessary; the comment on
PCI_DRIVER_OVERRIDE_DEVICE_VFIO doesn't really help, "This macro is
used to create a struct pci_device_id that matches a specific
device", then we proceed to use it with PCI_ANY_ID.
Fair enough, this is ment in the broader context, the generic vfio_pci
is just special.
vfio-pci has always tried (as much as possible) to be "just another
PCI" driver to avoid all the nasty issues that used to exist with
legacy KVM device assignment, so I cringe at seeing these vfio specific
hooks in PCI-core.  Thanks,
It is has always had very special behavior - a PCI driver without a
match table is is not "just another PCI" driver.

While this is not entirely elegant, considering where we have ended up
and the historical ABI that has to be preserved, it is the best idea
so far anyone has presented.

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help