Thread (268 messages), 15 authors, 2021-06-08

Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: David Gibson <hidden>
Date: 2021-05-27 07:12:42
Also in: linux-iommu, lkml

On Mon, May 24, 2021 at 08:37:44PM -0300, Jason Gunthorpe wrote:
On Mon, May 24, 2021 at 05:52:58PM +1000, David Gibson wrote:
I don't really see a semantic distinction between "always one-device
groups" and "groups don't matter".  Really the only way you can afford
to not care about groups is if they're singletons.
The kernel driver under the mdev may not be in an "always one-device"
group.
I don't really understand what you mean by that.
I mean the group of the mdev's actual DMA device may have multiple
things in it.
 
It is a kernel driver so the only thing we know and care about is that
all devices in the HW group are bound to kernel drivers.

The vfio device that spawns from this kernel driver is really a
"groups don't matter" vfio device because at the IOMMU layer it should
be riding on the physical group of the kernel driver.  At the VFIO
layer we no longer care about the group abstraction because the system
guarantees isolation in some other way.
Uh.. I don't really know how mdevs are isolated from each other.  I
thought it was because the physical device providing the mdevs
effectively had an internal IOMMU (or at least DMA permissioning) to
isolate the mdevs, even though the physical device may not be fully
isolated.

In that case the virtual mdev is effectively in a singleton group,
which is different from the group of its parent device.
That is one way to view it, but it means creating a whole group
infrastructure and abusing the IOMMU stack just to create this
nonsense fiction.
It's a nonsense fiction until it's not, at which point it will bite
you in the arse.
We also abuse the VFIO container stuff to hackily
create several different types of IOMMU uAPIs for the mdev - all of
which are unrelated to drivers/iommu.

Basically, there is no drivers/iommu thing involved, thus there is no real
iommu group; for mdev it is all a big hacky lie.
Well, "iommu" group might not be the best name, but hardware isolation
is still a real concern here, even if it's not entirely related to the
IOMMU.
If the physical device had a bug which meant the mdevs *weren't*
properly isolated from each other, then those mdevs would share a
group, and you *would* care about it.  Depending on how the isolation
failed the mdevs might or might not also share a group with the parent
physical device.
That isn't a real scenario.. mdevs that can't be isolated just
wouldn't be useful to exist
Really?  So what do you do when you discover some mdevs you thought
were isolated actually aren't due to a hardware bug?  Drop support
from the driver entirely?  In which case what do you say to the people
who understandably complain "but... we had all the mdevs in one guest
anyway, we don't care if they're not isolated"?
This is today's model, yes. When you run dpdk on a multi-group device
vfio already ensures that all the device groups remain parked and
inaccessible.
I'm not really following what you're saying there.

If you have a multi-device group, and dpdk is using one device in it,
VFIO *does not* (and cannot) ensure that other devices in the group
are parked and inaccessible.  
I mean in the sense that no other user space can open those devices
and no kernel driver can later be attached to them.
Ok.
It ensures that they're parked at the moment the group moves from
kernel to userspace ownership, but it can't prevent dpdk from
accessing and unparking those devices via peer to peer DMA.
Right, and adding all this group stuff did nothing to alert the poor
admin that is running DPDK to this risk.
Didn't it?  Seems to me that in order to give the group to DPDK, the
admin had to find and unbind all the things in it... so is therefore
aware that they're giving everything in it to DPDK.
If the administrator configures the system with different security
labels for different VFIO devices then yes removing groups makes this
more tricky as all devices in the group should have the same label.
That seems a bigger problem than "more tricky".  How would you propose
addressing this with your device-first model?
You put the same security labels you'd put on the group to the devices
that constitute the group. It is only more tricky in the sense that the
script that would have to do this will need to do more than ID the
group to label but also ID the device members of the group and label
their char nodes.
Well, I guess, if you take the view that root is allowed to break the
kernel.  I tend to prefer that although root can obviously break the
kernel if they intend to, we should make it hard to do by accident -
which in this case would mean the kernel *enforcing* that the devices
in the group have the same security labels, which I can't really see
how to do without an exposed group.
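
To make the userspace side of that concrete, here is a rough sketch of
the check such a labelling script would need to do.  It assumes the
existing sysfs layout /sys/kernel/iommu_groups/<N>/devices/ for finding
group members; the `labels` mapping (device name to security label) is
purely hypothetical, since how labels would be read from per-device
char nodes is exactly what is undecided here.

```python
import os
from collections import defaultdict

SYSFS_GROUPS = "/sys/kernel/iommu_groups"


def group_members(group_id, sysfs_root=SYSFS_GROUPS):
    """Return the sorted device names that belong to one IOMMU group."""
    return sorted(os.listdir(os.path.join(sysfs_root, str(group_id), "devices")))


def inconsistent_groups(labels, sysfs_root=SYSFS_GROUPS):
    """Find groups whose member devices do not all carry the same label.

    `labels` is a hypothetical dict mapping device name -> label string;
    devices missing from it are treated as having the label None.
    Returns {group_id: {label: [devices]}} for offending groups only.
    """
    bad = {}
    for group_id in sorted(os.listdir(sysfs_root)):
        by_label = defaultdict(list)
        for dev in group_members(group_id, sysfs_root):
            by_label[labels.get(dev)].append(dev)
        if len(by_label) > 1:  # more than one distinct label in this group
            bad[group_id] = dict(by_label)
    return bad
```

Note this only detects a mismatch after the fact, from userspace; it
does nothing to stop root from creating one, which is the enforcement
gap I'm worried about.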

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson