Thread: 268 messages, 15 authors, 2021-06-08

Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: David Gibson <hidden>
Date: 2021-05-24 07:56:15
Also in: linux-iommu, lkml

On Thu, May 13, 2021 at 10:59:38AM -0300, Jason Gunthorpe wrote:
On Thu, May 13, 2021 at 03:48:19PM +1000, David Gibson wrote:
On Mon, May 03, 2021 at 01:15:18PM -0300, Jason Gunthorpe wrote:
On Thu, Apr 29, 2021 at 01:04:05PM +1000, David Gibson wrote:
Again, I don't know enough about VDPA to make sense of that.  Are we
essentially talking non-PCI virtual devices here?  In which case you
could define the VDPA "bus" to always have one-device groups.
It is much worse than that.

What these non-PCI devices need is for the kernel driver to be part of
the IOMMU group of the underlying PCI device but tell VFIO land that
"groups don't matter"
I don't really see a semantic distinction between "always one-device
groups" and "groups don't matter".  Really the only way you can afford
to not care about groups is if they're singletons.
The kernel driver under the mdev may not be in an "always one-device"
group.
I don't really understand what you mean by that.
It is a kernel driver so the only thing we know and care about is that
all devices in the HW group are bound to kernel drivers.

The vfio device that spawns from this kernel driver is really a
"groups don't matter" vfio device because at the IOMMU layer it should
be riding on the physical group of the kernel driver.  At the VFIO
layer we no longer care about the group abstraction because the system
guarantees isolation in some other way.
Uh.. I don't really know how mdevs are isolated from each other.  I
thought it was because the physical device providing the mdevs
effectively had an internal IOMMU (or at least DMA permissioning) to
isolate the mdevs, even though the physical device may not be fully
isolated.

In that case the virtual mdev is effectively in a singleton group,
which is different from the group of its parent device.

If the physical device had a bug which meant the mdevs *weren't*
properly isolated from each other, then those mdevs would share a
group, and you *would* care about it.  Depending on how the isolation
failed the mdevs might or might not also share a group with the parent
physical device.
The issue is a software one of tightly coupling IOMMU HW groups to
VFIO's API and then introducing an entire class of VFIO mdev devices
that no longer care about IOMMU HW groups at all.
They don't necessarily care about the IOMMU groups of the parent
physical hardware, but they have their own IOMMU groups as virtual
hardware devices.
Currently mdev tries to trick this by creating singleton groups, but
it is very ugly and very tightly coupled to a specific expectation of
the few existing mdev drivers. Trying to add PASID made it a lot worse.
Aside: I'm primarily using "group" to mean the underlying hardware
unit, not the vfio construct on top of it, I'm not sure that's been
clear throughout.
Sure, that is obviously fixed, but I'm not interested in that.

I'm interested in having a VFIO API that makes sense for vfio-pci
which has a tight coupling to the HW notion of an IOMMU and also vfio
mdev's that have no concept of a HW IOMMU group.
So.. your model assumes that every device has a safe quiescent state
where it won't do any harm until poked, whether its group is
currently kernel owned, or owned by a userspace that doesn't know
anything about it.
This is today's model, yes. When you run dpdk on a multi-group device
vfio already ensures that all the device groups remained parked and
inaccessible.
I'm not really following what you're saying there.

If you have a multi-device group, and dpdk is using one device in it,
VFIO *does not* (and cannot) ensure that other devices in the group
are parked and inaccessible.  It ensures that they're parked at the
moment the group moves from kernel to userspace ownership, but it
can't prevent dpdk from accessing and unparking those devices via peer
to peer DMA.
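
To see which devices on a given host actually share a group, a rough
sketch along these lines may help.  It assumes the usual sysfs layout
under /sys/kernel/iommu_groups/<id>/devices/; the function names are
made up for illustration and are not part of any kernel or VFIO API.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group ID to the sorted list of device names in it."""
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (IOMMU disabled, or not a Linux host).
        return groups
    for group_id in os.listdir(root):
        devices_dir = os.path.join(root, group_id, "devices")
        if os.path.isdir(devices_dir):
            groups[group_id] = sorted(os.listdir(devices_dir))
    return groups


def multi_device_groups(groups):
    """Keep only the groups holding more than one device."""
    return {gid: devs for gid, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for gid, devs in sorted(multi_device_groups(iommu_groups()).items()):
        print(gid, devs)
```

Any group this prints is one where handing a single device to
userspace also exposes its peers in the way described above.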
At minimum this does mean that in order to use one device in the group
you must have permission to use *all* the devices in the group -
otherwise you may be able to operate a device you don't have
permission to by DMAing to its registers from a device you do have
permission to.
If the administrator configures the system with different security
labels for different VFIO devices then yes removing groups makes this
more tricky as all devices in the group should have the same label.
That seems a bigger problem than "more tricky".  How would you propose
addressing this with your device-first model?
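
To make the constraint concrete, here is a toy model of the rule that
using one device requires permission for every device in its group.
It is only a sketch: may_use_device, group_of and the device names are
hypothetical, and nothing here corresponds to a real VFIO or IOMMU
interface.

```python
def may_use_device(user_allowed, device, group_of):
    """Return True only if the user is allowed every device in the group.

    user_allowed: iterable of device names the user has permission for
    device:       the device the user wants to open
    group_of:     dict mapping device name -> group ID (hypothetical data)
    """
    group = group_of[device]
    peers = {d for d, g in group_of.items() if g == group}
    # Peer-to-peer DMA can reach any peer, so every peer must be permitted.
    return peers <= set(user_allowed)


# Hypothetical topology: dev_a and dev_b share a group, dev_c is alone.
group_of = {"dev_a": 1, "dev_b": 1, "dev_c": 2}
print(may_use_device({"dev_a"}, "dev_a", group_of))           # False
print(may_use_device({"dev_a", "dev_b"}, "dev_a", group_of))  # True
print(may_use_device({"dev_c"}, "dev_c", group_of))           # True
```

With per-device security labels, the first case is the problematic
one: the label on dev_a alone says yes, but the group says no.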

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
