Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs
From: Jacob Pan <hidden>
Date: 2021-03-24 19:03:54
Also in:
linux-iommu, lkml
Hi Jason,

On Fri, Mar 19, 2021 at 10:58:41AM +0100, Jean-Philippe Brucker wrote:
Although there is no use for it at the moment (only two upstream users and it looks like amdkfd always uses current too), I quite like the client-server model where the privileged process does bind() and programs the hardware queue on behalf of the client process.

On Fri, Mar 19, 2021 at 09:46:45AM -0300, Jason Gunthorpe wrote:
This creates a lot of complexity. How does process A get a secure reference to B? How does it access the memory in B to set up the HW?

On Fri, Mar 19, 2021 at 02:41:32PM +0100, Jean-Philippe Brucker wrote:
mm_access() for example, and passing addresses via IPC.

On Fri, 19 Mar 2021 10:54:32 -0300, Jason Gunthorpe [off-list ref] wrote:
I'd rather the source process establish its own PASID and then pass the rights to use it to some other process via FD passing than try to go the other way. There are lots of security questions with something like mm_access.

On Fri, Mar 19, 2021 at 11:22:21AM -0700, Jacob Pan wrote:
Hi Jason,
Thank you all for the input, it sounds like we are OK to remove the mm argument from iommu_sva_bind_device() and iommu_sva_alloc_pasid() for now? Let me try to summarize PASID allocation as below:

Interfaces    | Usage         | Limit  | bind¹ | User visible
/dev/ioasid²  | G-SVA/IOVA    | cgroup | No    | Yes
char dev³     | SVA           | cgroup | Yes   | No
iommu driver  | default PASID | no     | No    | No
kernel        | super SVA     | no     | Yes   | No

¹ Allocated during SVA bind
² PASIDs allocated via /dev/ioasid are not bound to any mm. But its ownership is assigned to the process that does the allocation.

On Mon, 22 Mar 2021 09:03:00 -0300, Jason Gunthorpe [off-list ref] wrote:
What does "not bound to a mm" mean?
I meant that the IOASID allocated via /dev/ioasid is in a clean state (just a number). Its initial state is not bound to an mm, unlike sva_bind_device(), where the IOASID is allocated at bind time. The use case is to support guest SVA bind, where allocation and bind are two separate steps.
IMHO a user-created PASID is either bound to a mm (current) at creation time, or it will never be bound to a mm and its page table is under user control via /dev/ioasid.
True for a PASID used in native SVA bind. But for binding with a guest mm, the PASID is allocated first (VT-d virtual cmd interface, Spec 10.4.44), then bound with the host IOMMU when the vIOMMU PASID cache is invalidated. Our intention is to have two separate interfaces:
1. /dev/ioasid (allocation/free only)
2. /dev/sva (handles all SVA-related activities including page tables)
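The two-step flow described above (allocate a bare ID first, bind it to an address space later) can be sketched as a small state machine. This is a userspace toy model only: `toy_alloc()`, `toy_bind()`, `toy_is_bound()` and the `toy_ioasid` type are hypothetical names invented for illustration and are not the proposed /dev/ioasid or /dev/sva interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_state { TOY_FREE, TOY_ALLOCATED, TOY_BOUND };

struct toy_ioasid {
    enum toy_state state;
    const void *mm;              /* stand-in for the bound address space */
};

#define TOY_MAX 8
static struct toy_ioasid toy_table[TOY_MAX];

/* Step 1: allocation hands out a bare number, not bound to any mm. */
int toy_alloc(void)
{
    for (int id = 1; id < TOY_MAX; id++) {   /* ID 0 is skipped in this toy */
        if (toy_table[id].state == TOY_FREE) {
            toy_table[id].state = TOY_ALLOCATED;
            toy_table[id].mm = NULL;
            return id;
        }
    }
    return -1;                               /* table exhausted */
}

/* Step 2: bind attaches an address space to an already allocated ID. */
int toy_bind(int id, const void *mm)
{
    if (id <= 0 || id >= TOY_MAX || mm == NULL)
        return -1;
    if (toy_table[id].state != TOY_ALLOCATED)
        return -1;                           /* never allocated, or already bound */
    toy_table[id].state = TOY_BOUND;
    toy_table[id].mm = mm;
    return 0;
}

bool toy_is_bound(int id)
{
    return id > 0 && id < TOY_MAX && toy_table[id].state == TOY_BOUND;
}
```

In this model, a native SVA bind would correspond to calling both steps back to back, while the guest case leaves a gap between them.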
I thought the whole point of something like a /dev/ioasid was to get away from each and every device creating its own PASID interface?
Yes, but only for the use cases that need to expose the PASID to userspace. AFAICT, the cases are:
1. guest SVA (bind guest mm)
2. full PF/VF assignment (not mediated), where the guest driver wants to program the actual PASID onto the device.
It may be somewhat reasonable that some devices could have some easy 'make a SVA PASID on current' interface built in,
I agree, this is the case where the PASID is hidden from userspace, right? e.g. uacce.
but anything more complicated should use /dev/ioasid, and anything consuming PASID should also have an API to import and attach a PASID from /dev/ioasid.
Would the above two use cases meet the "complicated" criteria? Or should we say that anything that needs the explicit PASID value has to go through /dev/ioasid? Could you give some high-level hints on the APIs that hook up an IOASID allocated from /dev/ioasid and on use cases that combine device and domain information? Yi is working on the /dev/sva RFC, so it would be good to have a direction check.
Currently, the proposed /dev/ioasid interface does not map an individual PASID to an FD. The FD is at the ioasid_set granularity and bound to the current mm. We could extend the IOCTLs to cover the individual PASID-FD passing case when use cases arise. Would this work?

Is it a good idea that the FD is per ioasid_set?
We were thinking the allocation IOCTL is on a per-set basis, so that we know the ownership relation between PASIDs and their set. If a per-PASID FD is needed, we can extend.
What is the set used for?
I tried to document the concept in https://lore.kernel.org/lkml/1614463286-97618-2-git-send-email-jacob.jun.pan@linux.intel.com/

In terms of usage for guest SVA, an ioasid_set is mostly tied to a host mm. The use cases are as follows:
1. Identify a pool of PASIDs for permission checking (PASIDs that belong to the same VM), e.g. only allow SVA binding for PASIDs allocated from the same set.
2. Allow different PASID-aware kernel subsystems to associate, e.g. KVM, device drivers, and the IOMMU driver. I.e. each KVM instance only cares about the ioasid_set associated with the VM. Event notifications are also within the ioasid_set to synchronize PASID states.
3. Guest-host PASID lookup (each set has its own XArray to store the mapping)
4. Quota control (going away once we have cgroup)
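Use cases 1 and 3 above (permission checking within a set, and per-set guest-host lookup) can be illustrated with a small userspace sketch. Everything here is an assumption for illustration: `toy_ioasid_set`, `toy_set_add()`, `toy_set_lookup()` and `toy_set_may_bind()` are hypothetical names, and a fixed-size array stands in for the per-set XArray mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SET_MAX 16
#define TOY_INVALID 0u           /* toy convention: 0 means "no host PASID" */

/* Toy model of one set, e.g. one per VM. Not the kernel structure. */
struct toy_ioasid_set {
    unsigned int guest[TOY_SET_MAX];
    unsigned int host[TOY_SET_MAX];
    int count;
};

/* Record a guest PASID -> host PASID mapping inside this set. */
int toy_set_add(struct toy_ioasid_set *set, unsigned int guest,
                unsigned int host)
{
    if (set->count >= TOY_SET_MAX || host == TOY_INVALID)
        return -1;
    set->guest[set->count] = guest;
    set->host[set->count] = host;
    set->count++;
    return 0;
}

/* Guest-host lookup: only mappings of this set are visible. */
unsigned int toy_set_lookup(const struct toy_ioasid_set *set,
                            unsigned int guest)
{
    for (int i = 0; i < set->count; i++)
        if (set->guest[i] == guest)
            return set->host[i];
    return TOY_INVALID;
}

/* Permission check: a bind is allowed only for a host PASID in this set. */
bool toy_set_may_bind(const struct toy_ioasid_set *set, unsigned int host)
{
    for (int i = 0; i < set->count; i++)
        if (set->host[i] == host)
            return true;
    return false;
}
```

Two VMs can use the same guest PASID value without conflict, because each lookup is scoped to its own set, and a bind request naming a host PASID from another set is refused.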
Usually kernel interfaces work nicer with a one fd/one object model. But even if it is a set, you could pass the set between co-operating processes and the PASID can be created in the correct 'current'. But there are all kinds of security questions as soon as you start doing anything like this - is there really a use case?
We don't see a use case for passing an ioasid_set to another process. All four use cases above are for the current process.
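For reference, the generic mechanism behind "FD passing" between co-operating processes on Linux is SCM_RIGHTS ancillary data over an AF_UNIX socket. The sketch below shows only that generic mechanism; `send_fd()` and `recv_fd()` are illustrative helper names, and nothing here is specific to /dev/ioasid or implies that such passing is part of the proposal.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send one file descriptor over a connected AF_UNIX socket. 0 or -1. */
int send_fd(int sock, int fd)
{
    char byte = 0;               /* at least one data byte must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new descriptor or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file description as the sender's, which is why the object behind the FD, and who may use it, is where the security questions raised above come in.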
Jason
Thanks,

Jacob