Thread (67 messages) 67 messages, 6 authors, 2023-04-03

Re: [PATCH v3 21/35] mm/mmap: write-lock adjacent VMAs if they can grow into unmapped area

From: Liam R. Howlett <hidden>
Date: 2023-02-17 14:52:59
Also in: linux-arm-kernel, linux-mm, lkml

* Suren Baghdasaryan [off-list ref] [230216 14:36]:
On Thu, Feb 16, 2023 at 7:34 AM Liam R. Howlett [off-list ref] wrote:
quoted

First, sorry I didn't see this before v3..
Feedback at any time is highly appreciated!
quoted
* Suren Baghdasaryan [off-list ref] [230216 00:18]:
quoted
While unmapping VMAs, adjacent VMAs might be able to grow into the area
being unmapped. In such cases write-lock adjacent VMAs to prevent this
growth.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/mmap.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 118b2246bba9..00f8c5798936 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2399,11 +2399,13 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
       * down_read(mmap_lock) and collide with the VMA we are about to unmap.
       */
      if (downgrade) {
-             if (next && (next->vm_flags & VM_GROWSDOWN))
+             if (next && (next->vm_flags & VM_GROWSDOWN)) {
+                     vma_start_write(next);
                      downgrade = false;
If the mmap write lock is insufficient to protect us from next/prev
modifications then we need to move *most* of this block above the maple
tree write operation, otherwise we have a race here.  When I say most, I
mean everything besides the call to mmap_write_downgrade() needs to be
moved.
Which prior maple tree write operation are you referring to? I see
__split_vma() and munmap_sidetree() which both already lock the VMAs
they operate on, so page faults can't happen in those VMAs.
The write that removes the VMAs from the maple tree a few lines above..
/* Point of no return */

If the mmap lock is not sufficient, then we need to move the
vma_start_write() of prev/next to above the call to
vma_iter_clear_gfp() in do_vmi_align_munmap().

But I still think it IS enough.
quoted
If the mmap write lock is sufficient to protect us from next/prev
modifications then we don't need to write lock the vmas themselves.
mmap write lock is not sufficient because with per-VMA locks we do not
take mmap lock at all.
Understood, but it also does not expand VMAs.
quoted
I believe this is for expand_stack() protection, so I believe it's okay
to not vma write lock these vmas.. I don't think there are other areas
where we can modify the vmas without holding the mmap lock, but others
on the CC list please chime in if I've forgotten something.

So, if I am correct, then you shouldn't lock next/prev and allow the
vma locking fault method on these vmas.  This will work because
lock_vma_under_rcu() uses mas_walk() on the faulting address.  That is,
your lock_vma_under_rcu() will fail to find anything that needs to be
grown and go back to mmap lock protection.  As it is written today, the
vma locking fault handler will fail and we will wait for the mmap lock
to be released even when the vma isn't going to expand.
So, let's consider a case when the next VMA is not being removed (so
it was neither removed nor locked by munmap_sidetree()) and it is
found by lock_vma_under_rcu() in the page fault handling path.
By this point next VMA is either NULL or outside the munmap area, so
what you said here is always true.
Page
fault handler can now expand it and push into the area we are
unmapping in unmap_region(). That is the race I'm trying to prevent
here by locking the next/prev VMAs which can be expanded before
unmap_region() unmaps them. Am I missing something?
Yes, I think the part you are missing (or I am missing..) is that
expand_stack() will never be called without the mmap lock.  We don't use
the vma locking to expand the stack.

...
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help