Thread: 233 messages, 15 authors, 2021-10-28

RE: [RFC] /dev/ioasid uAPI proposal

From: "Tian, Kevin" <kevin.tian@intel.com>
Date: 2021-06-04 06:08:36
Also in: linux-iommu, lkml

From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Thursday, June 3, 2021 8:11 PM

On Thu, Jun 03, 2021 at 03:45:09PM +1000, David Gibson wrote:
On Wed, Jun 02, 2021 at 01:58:38PM -0300, Jason Gunthorpe wrote:
On Wed, Jun 02, 2021 at 04:48:35PM +1000, David Gibson wrote:
	/* Bind guest I/O page table  */
	bind_data = {
		.ioasid	= gva_ioasid,
		.addr	= gva_pgtable1,
		/* and format information */
	};
	ioctl(ioasid_fd, IOASID_BIND_PGTABLE, &bind_data);
Again I do wonder if this should just be part of alloc_ioasid. Is
there any reason to split these things? The only advantage to the
split is the device is known, but the device shouldn't impact
anything..
I'm pretty sure the device(s) could matter, although they probably
won't usually.
It is a bit subtle, but the /dev/iommu fd itself is connected to the
devices first. This prevents wildly incompatible devices from being
joined together, and allows some "get info" to report the capability
union of all devices if we want to do that.
Right.. but I've not been convinced that having a /dev/iommu fd
instance be the boundary for these types of things actually makes
sense.  For example if we were doing the preregistration thing
(whether by child ASes or otherwise) then that still makes sense
across wildly different devices, but we couldn't share that layer if
we have to open different instances for each of them.
It is something that still seems up in the air.. What seems clear for
/dev/iommu is that it
 - holds a bunch of IOASID's organized into a tree
 - holds a bunch of connected devices
 - holds a pinned memory cache

One thing it must do is enforce IOMMU group security. A device cannot
be attached to an IOASID unless all devices in its IOMMU group are
part of the same /dev/iommu FD.
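
The group rule above can be sketched as a small check. This is only an illustration under assumed names: struct sketch_dev, its fields, and group_attach_allowed() are hypothetical and are not part of the proposed uAPI or of any kernel interface.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of a device: the IOMMU group it belongs to and the
 * /dev/iommu fd it is currently bound to (-1 when not bound to any fd).
 */
struct sketch_dev {
	int group_id;
	int iommu_fd;
};

/*
 * Return 0 if devs[target] may be attached to an IOASID, i.e. every device
 * sharing its IOMMU group is bound to the same /dev/iommu fd.
 * Return -EPERM otherwise.
 */
static int group_attach_allowed(const struct sketch_dev *devs, size_t n,
				size_t target)
{
	int fd = devs[target].iommu_fd;

	if (fd < 0)
		return -EPERM;	/* the target itself is not bound yet */

	for (size_t i = 0; i < n; i++) {
		if (devs[i].group_id == devs[target].group_id &&
		    devs[i].iommu_fd != fd)
			return -EPERM;	/* a group member lives elsewhere */
	}
	return 0;
}
```

With two devices in group 1 both bound to fd 3, either may attach; if one member of group 2 is still unbound, no device in group 2 may attach.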

The big open question is what parameters govern allowing devices to
connect to the /dev/iommu:
 - all devices can connect and we model the differences inside the API
   somehow.
I prefer this option if there is no significant blocker ahead.
 - Only sufficiently "similar" devices can be connected
 - The FD's capability is the minimum of all the connected devices

There are some practical problems here, when an IOASID is created the
kernel does need to allocate a page table for it, and that has to be
in some definite format.

It may be that we had a false start thinking the FD container should
be limited. Perhaps creating an IOASID should pass in a list
of the "device labels" that the IOASID will be used with and that can
guide the kernel what to do?
In the QEMU case the problem is that QEMU doesn't know the list of
devices that will be attached to an IOASID when the IOASID is created.
This is guest-side knowledge, which is conveyed to QEMU one device at
a time through the vIOMMU.

I feel it's fair to say that before the user creates an IOASID, he
should already have checked the format information of the device that
is intended to be attached right afterwards. Then, when creating the
IOASID, the user should specify a format compatible with that device.
There is no format check when the IOASID is created, since its I/O page
table is not installed in the IOMMU yet. Later, when the intended device
is attached to this IOASID, the kernel verifies the format and fails
the attach request if it is incompatible.
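
As a rough sketch of that flow (format chosen at creation, verified only at attach): the format enum, the two structs, and both helper functions below are made-up names for illustration, not proposed uAPI.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical page table format identifiers, for illustration only. */
enum pgtable_format { FMT_A = 1, FMT_B = 2 };

struct ioasid {
	enum pgtable_format format;	/* chosen by the user at creation */
};

struct device {
	uint32_t supported_formats;	/* bitmask of (1u << format) */
};

/*
 * Creation only records the requested format. No device is attached yet
 * and nothing is installed in the IOMMU, so there is nothing to check.
 */
static int ioasid_create(struct ioasid *ioasid, enum pgtable_format fmt)
{
	ioasid->format = fmt;
	return 0;
}

/* Attach is the point where the format is verified against the device. */
static int ioasid_attach(const struct ioasid *ioasid,
			 const struct device *dev)
{
	if (!(dev->supported_formats & (1u << ioasid->format)))
		return -EINVAL;	/* incompatible format: fail the attach */
	return 0;
}
```

In this model the user is expected to query the device's formats first and pass a compatible one to ioasid_create(); a wrong choice is only reported later, by ioasid_attach().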

Thanks
Kevin