Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE
From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2021-10-01 12:25:14
Also in: linux-iommu, lkml
On Fri, Oct 01, 2021 at 04:19:22PM +1000, david@gibson.dropbear.id.au wrote:
> On Wed, Sep 22, 2021 at 11:09:11AM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2021 at 03:40:25AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Wednesday, September 22, 2021 1:45 AM
> > > >
> > > > On Sun, Sep 19, 2021 at 02:38:39PM +0800, Liu Yi L wrote:
> > > > > This patch adds IOASID allocation/free interface per iommufd.
> > > > > When allocating an IOASID, userspace is expected to specify the
> > > > > type and format information for the target I/O page table.
> > > > >
> > > > > This RFC supports only one type
> > > > > (IOMMU_IOASID_TYPE_KERNEL_TYPE1V2), implying a kernel-managed
> > > > > I/O page table with vfio type1v2 mapping semantics. For this
> > > > > type the user should specify the addr_width of the I/O address
> > > > > space and whether the I/O page table is created in an iommu
> > > > > enforce_snoop format. enforce_snoop must be true at this point,
> > > > > as the false setting requires additional contract with KVM on
> > > > > handling WBINVD emulation, which can be added later.
> > > > >
> > > > > Userspace is expected to call IOMMU_CHECK_EXTENSION (see next
> > > > > patch) for what formats can be specified when allocating an
> > > > > IOASID.
> > > > >
> > > > > Open:
> > > > > - Devices on PPC platform currently use a different iommu
> > > > >   driver in vfio. Per previous discussion they can also use
> > > > >   vfio type1v2 as long as there is a way to claim a specific
> > > > >   iova range from a system-wide address space. This requirement
> > > > >   doesn't sound PPC specific, as addr_width for pci devices
> > > > >   can be also represented by a range [0, 2^addr_width-1]. This
> > > > >   RFC hasn't adopted this design yet. We hope to have formal
> > > > >   alignment in v1 discussion and then decide how to incorporate
> > > > >   it in v2.
> > > >
> > > > I think the request was to include a start/end IO address hint
> > > > when creating the ios. When the kernel creates it then it can
> > > > return the
> > >
> > > is the hint single-range or could be multiple-ranges?
> >
> > David explained it here:
> >
> > https://lore.kernel.org/kvm/YMrKksUeNW%2FPEGPM@yekko/
>
> Apparently not well enough. I've attempted again in this thread.
>
> > qemu needs to be able to choose if it gets the 32 bit range or 64
> > bit range.
>
> No. qemu needs to supply *both* the 32-bit and 64-bit range to its
> guest, and therefore needs to request both from the host.

As I understood your remarks each IOAS can only be one of the formats
as they have a different PTE layout. So here I meant that qemu needs to
be able to pick *for each IOAS* which of the two formats it is.

> Or rather, it *might* need to supply both. It will supply just the
> 32-bit range by default, but the guest can request the 64-bit range
> and/or remove and resize the 32-bit range via hypercall interfaces.
> Vaguely recent Linux guests certainly will request the 64-bit range
> in addition to the default 32-bit range.

And this would result in two different IOAS objects

Jason
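
For illustration only, here is a minimal sketch of the range arithmetic discussed above: an addr_width expressed as the inclusive range [0, 2^addr_width-1], and a check that two requested windows (for example a 32-bit and a 64-bit one) do not overlap. The struct and function names (ioas_window, window_from_addr_width, windows_overlap) are hypothetical and are not part of the proposed iommufd uAPI or of any existing kernel interface.

```c
#include <stdint.h>

/* Hypothetical type for illustration; not a real iommufd structure. */
struct ioas_window {
	uint64_t start;	/* first IOVA in the window, inclusive */
	uint64_t last;	/* last IOVA in the window, inclusive */
};

/* Express an addr_width as the inclusive range [0, 2^addr_width - 1]. */
static struct ioas_window window_from_addr_width(unsigned int addr_width)
{
	struct ioas_window w = { .start = 0, .last = 0 };

	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (addr_width >= 64)
		w.last = UINT64_MAX;
	else
		w.last = (UINT64_C(1) << addr_width) - 1;
	return w;
}

/* Return nonzero if the two inclusive windows share at least one IOVA. */
static int windows_overlap(struct ioas_window a, struct ioas_window b)
{
	return a.start <= b.last && b.start <= a.last;
}
```

Under this sketch, a request for a low 32-bit window and a separate high window would be described by two ioas_window values, which matches the point above that each one ends up as its own IOAS object. Whether the real interface takes a single range or several is exactly the open question in this thread.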