RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
From: Song Bao Hua (Barry Song) <hidden>
Date: 2021-02-02 03:48:25
Also in:
linux-iommu, lkml
-----Original Message-----
From: Tian, Kevin [mailto:kevin.tian@intel.com]
Sent: Tuesday, February 2, 2021 3:52 PM
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Song Bao Hua (Barry Song) <redacted>; chensihang (A) [off-list ref]; Arnd Bergmann [off-list ref]; Greg Kroah-Hartman [off-list ref]; linux-kernel@vger.kernel.org; iommu@lists.linux-foundation.org; linux-mm@kvack.org; Zhangfei Gao [off-list ref]; Liguozhu (Kenneth) [off-list ref]; linux-accelerators@lists.ozlabs.org
Subject: RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Tuesday, February 2, 2021 7:44 AM
>
> > On Fri, Jan 29, 2021 at 10:09:03AM +0000, Tian, Kevin wrote:
> > > > SVA is not doomed to work with IO page fault only. If we have SVA+pin,
> > > > we would get both sharing address and stable I/O latency.
> > >
> > > Isn't it like a traditional MAP_DMA API (imply pinning) plus specifying
> > > cpu_va of the memory pool as the iova?
> >
> > I think their issue is the HW can't do the cpu_va trick without also
> > involving the system IOMMU in a SVA mode
>
> This is the part that I didn't understand. Using cpu_va in a MAP_DMA
> interface doesn't require device support. It's just a user-specified
> address to be mapped into the IOMMU page table. On the other hand,
The background is that uacce is based on SVA and we are building
applications on uacce:
https://www.kernel.org/doc/html/v5.10/misc-devices/uacce.html
so the IOMMU simply uses the page table of the MMU and doesn't do any
special mapping to a user-specified address. We don't want to break the
basic assumption that uacce is using SVA; otherwise, we would need to
rebuild uacce and everything built on top of it.
> sharing the CPU page table through an SVA interface for a usage where
> I/O page faults must be completely avoided seems a misleading attempt.
It is not meant to avoid IO page faults completely. It is just an
extension for the high-performance I/O case, providing a way to avoid
IO latency jitter. Whether to use it is entirely up to users.
> Even if people do want this model (e.g. mix pinning+fault), it should be
> a mm syscall as Greg pointed out, not specific to sva.
We are glad to make it a syscall if people are happy with that. The
simplest way would be a syscall similar to userfaultfd, if we don't want
to mess up mm_struct.
> Thanks
> Kevin

Thanks
Barry
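
For reference, the "traditional MAP_DMA API ... plus specifying cpu_va of the memory pool as the iova" idea raised in the quoted text can be sketched with the VFIO type1 interface, which is one existing MAP_DMA-style API that pins the mapped pages. This is only an illustration under assumptions, not what uacce does: uacce is not a VFIO driver, the helper names fill_identity_map() and map_pool_at_cpu_va() are hypothetical, and container_fd is assumed to be a VFIO container that already has a type1 IOMMU set. Whether a CPU virtual address is usable as an IOVA also depends on the IOMMU's address range.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: build a type1 MAP_DMA request whose IOVA is the
 * same number as the CPU virtual address of the buffer.
 */
static void fill_identity_map(struct vfio_iommu_type1_dma_map *map,
                              void *buf, size_t len)
{
    memset(map, 0, sizeof(*map));
    map->argsz = sizeof(*map);
    map->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map->vaddr = (uintptr_t)buf;   /* process address of the pool */
    map->iova  = (uintptr_t)buf;   /* device sees the same address */
    map->size  = len;
}

/*
 * Hypothetical helper: issue the mapping. container_fd is assumed to be
 * an open VFIO container with a type1 IOMMU already selected.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int map_pool_at_cpu_va(int container_fd, void *buf, size_t len)
{
    struct vfio_iommu_type1_dma_map map;

    fill_identity_map(&map, buf, len);
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```

With this style of interface the mapping lives in an IOMMU page table managed separately from the process page table, which is the difference from the SVA model described above, where the IOMMU walks the MMU's page table directly.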