RE: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs
From: Tian, Kevin <hidden>
Date: 2021-05-08 07:31:32
Also in:
linux-iommu, lkml
> From: Alex Williamson <redacted>
> Sent: Saturday, May 8, 2021 1:06 AM
>
> > > Those are the main ones I can think of. It is nice to have a simple
> > > map/unmap interface, I'd hope that a new /dev/ioasid interface
> > > wouldn't raise the barrier to entry too high, but the user needs to
> > > have the ability to have more control of their mappings and locked
> > > page accounting should probably be offloaded somewhere. Thanks,
> >
> > Based on your feedbacks I feel it's probably reasonable to start with
> > a type1v2 semantics for the new interface. Locked accounting could
> > also start with the same VFIO restriction and then improve it
> > incrementally, if a cleaner way is intrusive (if not affecting uAPI).
> > But I didn't get the suggestion on "more control of their mappings".
> > Can you elaborate?
>
> Things like I note above, userspace cannot currently specify mapping
> granularity nor has any visibility to the granularity they get from the
> IOMMU. What actually happens in the IOMMU is pretty opaque to the user
> currently. Thanks,

It's much clearer. Based on all the discussions so far I'm thinking about
a staging approach when building the new interface, basically following
the model that Jason pointed out - generic stuff first, then platform
specific extension:

Phase 1: /dev/ioasid with core ingredients and vfio type1v2 semantics
  - ioasid is the software handle representing an I/O page table
  - uAPI accepts type1v2 map/unmap semantics per ioasid
  - helpers for VFIO/VDPA to bind ioasid_fd and attach ioasids
  - multiple ioasids are allowed without nesting (vIOMMU, or devices
    w/ incompatible iommu attributes)
  - an ioasid disallows any operation before it's attached to a device
  - an ioasid inherits iommu attributes from the 1st device attached
    to it
  - userspace is expected to manage hardware restrictions and the
    kernel only returns an error when restrictions are broken
      * map/unmap on an ioasid will fail before every device in a group
        is attached to it
      * ioasid attach will fail if the new device has iommu attributes
        incompatible with those of this ioasid
  - thus no group semantics in uAPI
  - no change to vfio container/group/type1 logic, for running existing
    vfio applications
      * implies some duplication between vfio type1 and ioasid for some
        time
  - new uAPI in vfio to allow explicit opening of a device and then
    binding it to the ioasid_fd
      * possibly requires each device to be exposed in /dev/vfio/
  - support both pdev and mdev
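
To make the Phase 1 rules above easier to poke at, here is a toy user-space model in Python. It is not kernel code and not a proposed uAPI: the class and method names (Device, Ioasid, attach, map, unmap) are made up for illustration, "incompatible attributes" is simplified to "attribute sets differ", and overlap checks on map are left out. The unmap rule reflects my reading of type1v2, namely that an unmap may remove whole mappings but must not bisect one.

```python
from dataclasses import dataclass, field
from typing import Optional


class IoasidError(Exception):
    """Stands in for an error code returned to userspace."""


@dataclass(frozen=True)
class Device:
    name: str
    group: int            # id of the iommu group the device belongs to
    attrs: frozenset      # iommu attributes (illustrative strings only)


@dataclass
class Ioasid:
    """Toy model of one ioasid, i.e. one I/O page table (hypothetical)."""
    groups: dict                          # group id -> set of all Devices in it
    attrs: Optional[frozenset] = None     # inherited from the 1st attached device
    attached: set = field(default_factory=set)
    mappings: dict = field(default_factory=dict)   # iova -> (vaddr, size)

    def attach(self, dev: Device) -> None:
        if self.attrs is None:
            self.attrs = dev.attrs        # first device defines the attributes
        elif dev.attrs != self.attrs:     # simplification of "incompatible"
            raise IoasidError(f"{dev.name}: incompatible iommu attributes")
        self.attached.add(dev)

    def _check_usable(self) -> None:
        # No operation is allowed before the ioasid is attached to a device.
        if not self.attached:
            raise IoasidError("ioasid is not attached to any device")
        # Every device of each touched group must be attached.
        for dev in self.attached:
            if self.groups[dev.group] - self.attached:
                raise IoasidError(f"group {dev.group} is only partly attached")

    def map(self, iova: int, vaddr: int, size: int) -> None:
        self._check_usable()
        self.mappings[iova] = (vaddr, size)

    def unmap(self, iova: int, size: int) -> int:
        """Remove whole mappings inside [iova, iova + size); return bytes unmapped."""
        self._check_usable()
        end = iova + size
        hits = [(s, n) for s, (_, n) in self.mappings.items()
                if s < end and s + n > iova]
        if any(s < iova or s + n > end for s, n in hits):
            raise IoasidError("unmap would bisect an existing mapping")
        for s, _ in hits:
            del self.mappings[s]
        return sum(n for _, n in hits)
```

Note that the model has no group object that userspace manipulates: the group only shows up as a condition that makes map/unmap fail, which is the sense in which the uAPI carries no group semantics.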

Phase 2: ioasid nesting
  - Allow bind/unbind_pgtable semantics per ioasid
  - Allow ioasid nesting
      * HW ioasid nesting if supported by platform
      * otherwise fall back to SW ioasid nesting (in-kernel shadowing)
  - iotlb invalidation per ioasid
  - I/O page fault handling per ioasid
  - hw_id is not exposed in uAPI. Vendor IOMMU driver decides
    when/how hw_id is allocated and programmed properly
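
As a rough illustration of the two nesting options in Phase 2, the Python sketch below models page tables as dicts at 4K granularity. It assumes that nesting means a "child" table whose output is translated again by a "parent" table, and that SW nesting (shadowing) means composing the two into a single table that yields the same result. The names child, parent, translate_nested and build_shadow are mine, not from any driver, and invalidation is reduced to "rebuild the shadow".

```python
PAGE = 0x1000  # assumed 4K page size for this sketch


class IoPageFault(Exception):
    """Stands in for an I/O page fault reported per ioasid."""


def translate(table: dict, addr: int) -> int:
    """Translate one address through a page-granular table {in_page: out_page}."""
    base = addr & ~(PAGE - 1)
    if base not in table:
        raise IoPageFault(f"no translation for {addr:#x}")
    return table[base] | (addr & (PAGE - 1))


def translate_nested(child: dict, parent: dict, addr: int) -> int:
    """HW-style nesting: walk the child table, then the parent table."""
    return translate(parent, translate(child, addr))


def build_shadow(child: dict, parent: dict) -> dict:
    """SW-style nesting: precompute the composed table (shadowing).

    Entries whose intermediate page is not mapped by the parent are left
    out, so they fault in the shadow just as they would when nested.
    """
    return {va: parent[ipa] for va, ipa in child.items() if ipa in parent}
```

In this model any change to the child table leaves the shadow stale until it is rebuilt, which is one reason per-ioasid iotlb invalidation matters for the SW fallback.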

Phase 3: optimizations and vendor extensions (order undefined, up to
the specific feature owner):
  - (Intel) ENQCMD support with hw_id exposure in uAPI
  - (ARM/AMD) RID-based pasid table assignment
  - (PPC) window-based iova management
  - Optimizations:
      * replace vfio type1 with a shim driver to use ioasid backend
      * mapping granularity
      * HW dirty page tracking
      * ...

Does the above sound like a sensible plan? If yes, we'll start working
on phase 1 then...

Thanks
Kevin