Re: [PATCH kernel v5 5/6] vfio/spapr: Reference mm in tce_container
From: Alexey Kardashevskiy <hidden>
Date: 2016-11-22 03:50:01
Also in:
kvm
On 22/11/16 13:38, David Gibson wrote:
On Thu, Nov 17, 2016 at 06:39:41PM +1100, Alexey Kardashevskiy wrote:quoted
On 11/11/16 23:32, Alexey Kardashevskiy wrote:quoted
In some situations the userspace memory context may live longer than the userspace process itself so if we need to do proper memory context cleanup, we better have tce_container take a reference to mm_struct and use it later when the process is gone (@current or @current->mm is NULL). This references mm and stores the pointer in the container; this is done in a new helper - tce_iommu_mm_set() - when one of the following happens: - a container is enabled (IOMMU v1); - a first attempt to pre-register memory is made (IOMMU v2); - a DMA window is created (IOMMU v2). The @mm stays referenced till the container is destroyed. This replaces current->mm with container->mm everywhere except debug prints. This adds a check that current->mm is the same as the one stored in the container to prevent userspace from making changes to a memory context of other processes. DMA map/unmap ioctls() do not check for @mm as they already check for @enabled which is set after tce_iommu_mm_set() is called. Signed-off-by: Alexey Kardashevskiy <redacted> --- Changes: v5: * postpone referencing of mm v4: * added check for container->mm!=current->mm in tce_iommu_ioctl() for all ioctls and removed other redundand checks --- drivers/vfio/vfio_iommu_spapr_tce.c | 159 ++++++++++++++++++++++-------------- 1 file changed, 99 insertions(+), 60 deletions(-)diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c index 1c02498..9a81a7e 100644 --- a/drivers/vfio/vfio_iommu_spapr_tce.c +++ b/drivers/vfio/vfio_iommu_spapr_tce.c@@ -31,49 +31,49 @@ static void tce_iommu_detach_group(void *iommu_data, struct iommu_group *iommu_group); -static long try_increment_locked_vm(long npages) +static long try_increment_locked_vm(struct mm_struct *mm, long npages) { long ret = 0, locked, lock_limit; - if (!current || !current->mm) - return -ESRCH; /* process exited */ + if (!mm) + return -EPERM; if (!npages) return 0; - down_write(¤t->mm->mmap_sem); - locked = current->mm->locked_vm + npages; + down_write(&mm->mmap_sem); + locked = mm->locked_vm + npages; lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; if (locked > lock_limit && !capable(CAP_IPC_LOCK))Oh boy. Now it seems I have to reference a task, not just mm (which I may not have to reference at all after all as the task reference should keep mm alive) as I missed the fact capable() and rlimit() are working with @current.Sorry, what? I'm not seeing how a task reference comes into this.
I reference @mm to make sure that just one mm uses a container. If mm changes, I return an error, sanity check. The code also increments locked_vm in mm. But it looks at the current task if there is room for increments and for CAP_IPC_LOCK. So, the options are: 1. I do not reference the current task, and if mm changes, then the mm sanity check won't let me proceed to the code which tries using current OR 2. reference a task when I reference mm and do that sanity check not just for mm but also for current task. Makes sense?
quoted
Alex, Is there anything else I should fix before posting v6? Thanks.
-- Alexey
Attachments
- signature.asc [application/pgp-signature] 839 bytes