Re: [PATCH v3 17/35] mm/mmap: write-lock VMA before shrinking or expanding it

From: Suren Baghdasaryan <surenb@google.com>
Date: 2023-02-24 16:19:41
Also in: linux-mm, lkml

On Fri, Feb 24, 2023 at 8:14 AM Liam R. Howlett [off-list ref] wrote:

* Suren Baghdasaryan [off-list ref] [230223 21:06]:

On Thu, Feb 23, 2023 at 5:46 PM Liam R. Howlett [off-list ref] wrote:

* Suren Baghdasaryan [off-list ref] [230223 16:16]:

On Thu, Feb 23, 2023 at 12:28 PM Liam R. Howlett
[off-list ref] wrote:


Wait, I figured a better place to do this.

init_multi_vma_prep() should vma_start_write() on any VMA that is passed
in.. that way we catch any modifications here & in vma_merge(), which I
think is missed in this patch set?

Hmm. That looks like a good idea but in that case, why not do the
locking inside vma_prepare() itself? From the description of that
function it sounds like it was designed to acquire locks before VMA
modifications, so would be the ideal location for doing that. WDYT?

That might be even better.  I think it will result in even less code.

Yes.
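
To make the idea above concrete, here is a minimal userspace sketch of
"take the write lock inside vma_prepare()". Everything in it is an
assumption for illustration: the *_model types and functions are
hypothetical stand-ins, the field list is only a subset of what the
kernel struct carries, and the sequence-number check is a simplified
model of how a VMA is treated as write-locked in this series.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model for illustration; these are not the kernel's types. */
struct mm_model {
	int mm_lock_seq;	/* changes when all VMA write locks are dropped */
};

struct vma_model {
	struct mm_model *mm;
	int vm_lock_seq;	/* equals mm->mm_lock_seq while write-locked */
};

/* Hypothetical subset of the VMAs handed to vma_prepare(). */
struct vma_prepare_model {
	struct vma_model *vma, *adj_next, *insert, *remove, *remove2;
};

static int vma_is_write_locked(const struct vma_model *vma)
{
	return vma->vm_lock_seq == vma->mm->mm_lock_seq;
}

/* Mark one VMA write-locked; NULL and already-locked VMAs are skipped. */
static void vma_start_write_model(struct vma_model *vma)
{
	if (vma && !vma_is_write_locked(vma))
		vma->vm_lock_seq = vma->mm->mm_lock_seq;
}

/*
 * Write-lock every VMA involved in the operation in one place, before
 * any of them is modified. The real function would go on to take the
 * file mapping and anon_vma locks; that part is omitted here.
 */
static void vma_prepare_model(struct vma_prepare_model *vp)
{
	vma_start_write_model(vp->vma);
	vma_start_write_model(vp->adj_next);
	vma_start_write_model(vp->insert);
	vma_start_write_model(vp->remove);
	vma_start_write_model(vp->remove2);
}
```

The point of the sketch is only that callers no longer need to remember
which VMAs to lock: anything placed in the struct gets locked.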

There is also a vma_complete() which might work to call
vma_end_write_all() as well?

If there are other VMAs already locked before vma_prepare() then we
would unlock them too. Safer to just let mmap_unlock do
vma_end_write_all().
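
A small model of why calling vma_end_write_all() from vma_complete()
would release too much. This is a sketch under the assumption that
write locks are tracked by comparing a per-VMA sequence number with a
per-mm one, so that bumping the mm counter drops every write lock at
once. The *_model names are hypothetical, not the kernel's API.

```c
#include <assert.h>

/* Userspace model for illustration; these are not the kernel's types. */
struct mm_model {
	int mm_lock_seq;
};

struct vma_model {
	struct mm_model *mm;
	int vm_lock_seq;	/* equals mm->mm_lock_seq while write-locked */
};

static int vma_is_write_locked(const struct vma_model *vma)
{
	return vma->vm_lock_seq == vma->mm->mm_lock_seq;
}

static void vma_start_write_model(struct vma_model *vma)
{
	vma->vm_lock_seq = vma->mm->mm_lock_seq;
}

/*
 * Dropping "all" write locks is a single counter bump. It cannot tell
 * a VMA locked inside vma_prepare() from one locked earlier under the
 * same mmap_lock hold, so both become unlocked.
 */
static void vma_end_write_all_model(struct mm_model *mm)
{
	mm->mm_lock_seq++;
}
```

With this model, a VMA locked by an earlier step and a VMA locked in
vma_prepare() are both released by the same call, which is why leaving
the release to mmap_unlock is the safer choice.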

The only concern is vma_adjust_trans_huge() being called before
vma_prepare() but I *think* that's safe because
vma_adjust_trans_huge() does its modifications after acquiring PTL
lock, which page fault handlers also have to take. Does that sound
right?

I am not sure.  We are certainly safe the way it is, and the PTL has to
be safe for concurrent faults.. but this could alter the walk to a page
table while that walk is occurring and I don't think that happens today.

It might be best to leave the locking order the way you have it, unless
someone can tell us it's safe?

Yes, I have the same feelings about changing this.

We could pass through the three extra variables that are needed to move
the vma_adjust_trans_huge() call within that function as well?  This
would have the added benefit of having all locking grouped in the one
location.  The argument list would be getting long, but we could use the
struct instead.

Any issues if I change the order to have vma_prepare() called always
before vma_adjust_trans_huge()? That way the VMA will always be locked
before vma_adjust_trans_huge() executes and we don't need any
additional arguments.

I preserved the locking order from __vma_adjust() to ensure there were no
issues.

I am not sure but, looking through the page table information [1], it
seems that vma_adjust_trans_huge() uses the pmd lock, which is part of
the split page table lock.  According to the comment in rmap, it should
be fine to reverse the ordering here.

Instead of:

mmap_lock()
vma_adjust_trans_huge()
        pte_lock
        pte_unlock

vma_prepare()
        mapping->i_mmap_rwsem lock
        anon_vma->rwsem lock

<changes to tree/VMAs>

vma_complete()
        anon_vma->rwsem unlock
        mapping->i_mmap_rwsem unlock

mmap_unlock()

---------

We would have:

mmap_lock()
vma_prepare()
        mapping->i_mmap_rwsem lock
        anon_vma->rwsem lock

vma_adjust_trans_huge()
        pte_lock
        pte_unlock

<changes to tree/VMAs>

vma_complete()
        anon_vma->rwsem unlock
        mapping->i_mmap_rwsem unlock

mmap_unlock()


Essentially, increasing the nesting of the pte lock, but not violating
the ordering.
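
The two sequences above can be checked mechanically with a toy
lock-order checker. This is only a sketch: the rank order (mmap_lock,
then i_mmap_rwsem, then anon_vma rwsem, then the page table lock) is my
reading of the ordering discussed in this thread, and the enum, struct
and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Outermost lock first; a higher value means a more deeply nested lock. */
enum lock_rank { MMAP_LOCK, I_MMAP_RWSEM, ANON_VMA_RWSEM, PTE_LOCK, NR_LOCKS };

struct lock_event {
	enum lock_rank lock;
	int acquire;		/* 1 = lock, 0 = unlock */
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Returns 1 if no lock is taken while a more deeply nested lock is held. */
static int order_respected(const struct lock_event *ev, size_t n)
{
	int held[NR_LOCKS] = { 0 };

	for (size_t i = 0; i < n; i++) {
		if (!ev[i].acquire) {
			held[ev[i].lock] = 0;
			continue;
		}
		for (int r = ev[i].lock + 1; r < NR_LOCKS; r++)
			if (held[r])
				return 0;
		held[ev[i].lock] = 1;
	}
	return 1;
}

/* vma_adjust_trans_huge() first, then vma_prepare(). */
static const struct lock_event current_order[] = {
	{ MMAP_LOCK, 1 }, { PTE_LOCK, 1 }, { PTE_LOCK, 0 },
	{ I_MMAP_RWSEM, 1 }, { ANON_VMA_RWSEM, 1 },
	{ ANON_VMA_RWSEM, 0 }, { I_MMAP_RWSEM, 0 }, { MMAP_LOCK, 0 },
};

/* vma_prepare() first; the pte lock now nests inside the rwsems. */
static const struct lock_event proposed_order[] = {
	{ MMAP_LOCK, 1 }, { I_MMAP_RWSEM, 1 }, { ANON_VMA_RWSEM, 1 },
	{ PTE_LOCK, 1 }, { PTE_LOCK, 0 },
	{ ANON_VMA_RWSEM, 0 }, { I_MMAP_RWSEM, 0 }, { MMAP_LOCK, 0 },
};

/* Counter-example: taking i_mmap_rwsem while the pte lock is held. */
static const struct lock_event inverted_order[] = {
	{ MMAP_LOCK, 1 }, { PTE_LOCK, 1 }, { I_MMAP_RWSEM, 1 },
};
```

Both the current and the proposed sequence pass the check; only the
counter-example, which takes an outer lock while the pte lock is held,
is rejected. That matches the "deeper nesting, same ordering" argument.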

1. https://docs.kernel.org/mm/split_page_table_lock.html

Thanks for the confirmation, Liam. I'll make the changes and test over
the weekend. If everything's still fine, I will post the next version
with these and other requested changes on Monday.

remove & remove2 should be detached in vma_prepare() or
vma_complete() as well?

They are marked detached in vma_complete() (see
https://lore.kernel.org/all/20230216051750.3125598-25-surenb@google.com/)
and that should be enough. We should be safe as long as we mark them
detached before unlocking the VMA.

Right, Thanks.
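
A toy model of the "detached before unlock" rule from this exchange.
It assumes a fault-path reader that backs off when the VMA is
write-locked and, once it can take the lock, also checks a detached
flag. The *_model names and reader_can_use_vma() are hypothetical, and
the model is single-threaded, so it shows the ordering argument only,
not the real synchronization.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model for illustration; these are not the kernel's types. */
struct mm_model {
	int mm_lock_seq;
};

struct vma_model {
	struct mm_model *mm;
	int vm_lock_seq;	/* equals mm->mm_lock_seq while write-locked */
	bool detached;		/* set once the VMA is removed from the tree */
};

static bool vma_is_write_locked(const struct vma_model *vma)
{
	return vma->vm_lock_seq == vma->mm->mm_lock_seq;
}

static void vma_start_write_model(struct vma_model *vma)
{
	vma->vm_lock_seq = vma->mm->mm_lock_seq;
}

static void vma_end_write_all_model(struct mm_model *mm)
{
	mm->mm_lock_seq++;
}

/* True only if a reader may handle the fault under the per-VMA lock. */
static bool reader_can_use_vma(const struct vma_model *vma)
{
	if (vma_is_write_locked(vma))
		return false;	/* writer active: fall back to mmap_lock */
	if (vma->detached)
		return false;	/* VMA was removed: fall back to mmap_lock */
	return true;
}
```

As long as the flag is set while the write lock is still held, there is
no window in which a reader can both take the lock and see the removed
VMA as attached.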

...
