Thread (268 messages), 15 authors, 2021-06-08

RE: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: Tian, Kevin <hidden>
Date: 2021-05-12 00:21:37
Also in: linux-iommu, lkml

From: Jason Gunthorpe <redacted>
Sent: Wednesday, May 12, 2021 7:40 AM

On Tue, May 11, 2021 at 10:51:40PM +0000, Tian, Kevin wrote:
From: Jason Gunthorpe <redacted>
Sent: Tuesday, May 11, 2021 10:39 PM

On Tue, May 11, 2021 at 09:10:03AM +0000, Tian, Kevin wrote:
3) SRIOV, ENQCMD (Intel):
	- "PASID global" with host-allocated PASIDs;
	- PASID table managed by host (in HPA space);
	- all RIDs bound to this ioasid_fd use the global pool;
	- however, exposing global PASID into guest breaks migration;
	- hybrid scheme: split local PASID range and global PASID range;
	- force guest to use only local PASID range (through vIOMMU);
	- for ENQCMD, configure CPU to translate local->global;
	- for non-ENQCMD, setup both local/global pasid entries;
	- uAPI for range split and CPU pasid mapping:

    // set to "PASID global"
    ioctl(ioasid_fd, IOASID_SET_HWID_MODE, IOASID_HWID_GLOBAL);

    // split local/global range, applying to all RIDs in this fd
    // Example: local [0, 1024), global [1024, max)
    // local PASID range is managed by guest and migrated as VM state
    // global PASIDs are re-allocated and mapped to local PASIDs
    // post migration
    ioctl(ioasid_fd, IOASID_HWID_SET_GLOBAL_MIN, 1024);
I'm still not sold that ranges are the best idea here; they just add
more state that has to match during migration. Keeping the
global/local split per RID seems much cleaner to me.
With ENQCMD the PASID is kept in a CPU MSR, making it a process
context within the guest. When a guest process is bound to two
devices, the same local PASID must be usable on both devices.
A per-RID split cannot guarantee that.
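
To make the concern concrete, here is a minimal sketch of how independent
per-RID splits can disagree about one PASID. The struct and helper names
are hypothetical and not part of the proposed uAPI; the only thing taken
from the proposal is the convention that PASIDs below the global minimum
are local.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-RID split: PASIDs in [0, global_min) are local
 * (guest-managed), PASIDs in [global_min, max) are global. */
struct rid_pasid_split {
    uint32_t global_min;
};

/* True if this RID treats the PASID as local. */
static bool pasid_is_local(const struct rid_pasid_split *s, uint32_t pasid)
{
    return pasid < s->global_min;
}

/* A PASID loaded into the CPU MSR for a guest process is only safe to
 * use on two devices if both RIDs classify it as local. */
static bool pasid_local_on_both(const struct rid_pasid_split *a,
                                const struct rid_pasid_split *b,
                                uint32_t pasid)
{
    return pasid_is_local(a, pasid) && pasid_is_local(b, pasid);
}
```

With one RID split at 1024 and another at 512, PASID 600 is local on the
first device but falls in the global range on the second, so the same
process PASID cannot be used on both.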
That is only for ENQCMD. All drivers know if they are ENQCMD
compatible drivers and can ensure they use the global allocator
consistently for their RIDs.

Basically each RID knows based on its kernel drivers if it is a local
or global RID and the ioasid knob can further fine tune this for any
other specialty cases.
It's fine if you insist on this way. Then we leave it to userspace to
ensure that the same split range is used across devices when a vIOMMU
is involved. Please note that such a range split has to be enforced
through the vIOMMU, which (e.g. on VT-d) includes a register to report
the available PASID space size (applying to all devices behind this
vIOMMU) to the guest. The kernel just follows the per-RID split info.
If anything breaks, userspace has just shot itself in the foot.
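
As an illustration of the check that would then fall to userspace, the
following is a rough sketch under stated assumptions: the struct, the
function name and the idea of expressing the vIOMMU-reported PASID space
as a plain count are all hypothetical, and nothing here is an existing
kernel or VMM interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the split configured for one RID. */
struct rid_split {
    uint16_t rid;         /* requester ID of the device */
    uint32_t global_min;  /* local range is [0, global_min) */
};

/* Returns true only if every RID behind the vIOMMU uses a local range
 * equal to the PASID space size the vIOMMU reports to the guest
 * (assumed here to be a simple count of usable PASIDs). */
static bool splits_consistent(const struct rid_split *s, size_t n,
                              uint32_t viommu_pasid_space)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].global_min != viommu_pasid_space)
            return false;
    }
    return true;
}
```

A VMM could run such a check before exposing the vIOMMU to the guest and
refuse to start if any device was configured with a different split.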
It does need some user visible difference because SIOV/mdev is not
migratable. Only the kernel can select a PASID, userspace (and hence
the guest) shouldn't have the option to force a specific PASID as the
PASID space is shared across the entire RID to all VMs using the mdev.
It is non-migratable only when you choose to expose host-allocated
PASIDs to the guest. However, throughout this proposal we actually
virtualize PASIDs, letting the guest manage its own PASID space in all
scenarios.
PASID cannot be virtualized without also using ENQCMD.

An mdev that is using PASID without ENQCMD is non-migratable and this
needs to be made visible in the uAPI.
No. Without ENQCMD the PASID must be programmed into an mdev MMIO
register. This operation is mediated, so the mdev driver can translate
the PASID from virtual to real.
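
A minimal sketch of that mediation, assuming a small fixed-size lookup
table per mdev. All names below are hypothetical illustrations and not
existing mdev or IOASID interfaces; a real driver would size and manage
the mapping differently.

```c
#include <stddef.h>
#include <stdint.h>

#define VPASID_MAX    16          /* hypothetical table size for the sketch */
#define PASID_INVALID UINT32_MAX  /* marks an unmapped entry */

/* Hypothetical per-mdev state: guest (virtual) PASID -> host (real) PASID. */
struct mdev_pasid_map {
    uint32_t host_pasid[VPASID_MAX];
};

/* Start with no virtual PASID mapped. */
static void mdev_pasid_map_init(struct mdev_pasid_map *m)
{
    for (size_t i = 0; i < VPASID_MAX; i++)
        m->host_pasid[i] = PASID_INVALID;
}

/* Record the host PASID backing a guest PASID. Returns 0 on success,
 * -1 if the guest PASID does not fit in the table. */
static int mdev_pasid_map_bind(struct mdev_pasid_map *m,
                               uint32_t vpasid, uint32_t hpasid)
{
    if (vpasid >= VPASID_MAX)
        return -1;
    m->host_pasid[vpasid] = hpasid;
    return 0;
}

/* Called when a guest write to the PASID MMIO register is trapped.
 * Returns the real PASID to program into hardware, or PASID_INVALID
 * if the guest PASID has no mapping. */
static uint32_t mdev_mmio_write_pasid(const struct mdev_pasid_map *m,
                                      uint32_t vpasid)
{
    if (vpasid >= VPASID_MAX)
        return PASID_INVALID;
    return m->host_pasid[vpasid];
}
```

Because the guest only ever sees the virtual value, the host side of the
table can be rebuilt with newly allocated PASIDs after migration.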

Thanks
Kevin