Thread (19 messages) 19 messages, 3 authors, 2016-11-24

Re: [PATCH kernel v5 5/6] vfio/spapr: Reference mm in tce_container

From: Alexey Kardashevskiy <hidden>
Date: 2016-11-22 03:50:01
Also in: kvm

On 22/11/16 13:38, David Gibson wrote:
On Thu, Nov 17, 2016 at 06:39:41PM +1100, Alexey Kardashevskiy wrote:
quoted
On 11/11/16 23:32, Alexey Kardashevskiy wrote:
quoted
In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).

This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.

This replaces current->mm with container->mm everywhere except debug
prints.

This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.

DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.

Signed-off-by: Alexey Kardashevskiy <redacted>
---
Changes:
v5:
* postpone referencing of mm

v4:
* added check for container->mm!=current->mm in tce_iommu_ioctl()
for all ioctls and removed other redundand checks
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 159 ++++++++++++++++++++++--------------
 1 file changed, 99 insertions(+), 60 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 1c02498..9a81a7e 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -31,49 +31,49 @@
 static void tce_iommu_detach_group(void *iommu_data,
 		struct iommu_group *iommu_group);
 
-static long try_increment_locked_vm(long npages)
+static long try_increment_locked_vm(struct mm_struct *mm, long npages)
 {
 	long ret = 0, locked, lock_limit;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
+	if (!mm)
+		return -EPERM;
 
 	if (!npages)
 		return 0;
 
-	down_write(&current->mm->mmap_sem);
-	locked = current->mm->locked_vm + npages;
+	down_write(&mm->mmap_sem);
+	locked = mm->locked_vm + npages;
 	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	if (locked > lock_limit && !capable(CAP_IPC_LOCK))


Oh boy. Now it seems I have to reference a task, not just mm (which I may
not have to reference at all after all as the task reference should keep mm
alive) as I missed the fact capable() and rlimit() are working with
@current.
Sorry, what?  I'm not seeing how a task reference comes into this.
I reference @mm to make sure that just one mm uses a container. If mm
changes, I return an error, sanity check.

The code also increments locked_vm in mm. But it looks at the current task
if there is room for increments and for CAP_IPC_LOCK.

So, the options are:
1. I do not reference the current task, and if mm changes, then the mm
sanity check won't let me proceed to the code which tries using current OR
2. reference a task when I reference mm and do that sanity check not just
for mm but also for current task.

Makes sense?

quoted

Alex,

Is there anything else I should fix before posting v6? Thanks.


-- 
Alexey

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help