Re: [PATCH v4] mm: improve mprotect(R|W) efficiency on pages referenced once

From: Peter Xu <peterx@redhat.com>
Date: 2021-05-28 12:32:33

On Thu, May 27, 2021 at 06:35:42PM -0700, Peter Collingbourne wrote:

On Thu, May 27, 2021 at 6:21 PM Peter Xu [off-list ref] wrote:

quoted

On Thu, May 27, 2021 at 04:37:11PM -0700, Peter Collingbourne wrote:

quoted

On Thu, May 27, 2021 at 2:41 PM Peter Xu [off-list ref] wrote:

quoted

Peter,

On Thu, May 27, 2021 at 12:04:53PM -0700, Peter Collingbourne wrote:

[...]

quoted

+static bool may_avoid_write_fault(pte_t pte, struct vm_area_struct *vma,
+                               unsigned long cp_flags)
+{
+     if (!(cp_flags & MM_CP_DIRTY_ACCT)) {
+             if (!(vma_is_anonymous(vma) && (vma->vm_flags & VM_WRITE)))
+                     return false;
+
+             if (page_count(pte_page(pte)) != 1)
+                     return false;
+     }

Can we make MM_CP_DIRTY_ACCT still in charge?  IIUC that won't affect your use
case, something like:

       /* Never apply trick if MM_CP_DIRTY_ACCT not set */
       if (!(cp_flags & MM_CP_DIRTY_ACCT))
           return false;

The thing is that's really what MM_CP_DIRTY_ACCT is about, imho (as its name
shows).  Say, we should start to count on the dirty bit for applying the write
bit only if that flag set.  With above, I think we can drop the pte_uffd_wp()
check below because uffd_wp never applies MM_CP_DIRTY_ACCT when do
change_protection().

I don't think that would work. The anonymous pages that we're
interesting in optimizing are private writable pages, for which
vma_wants_writenotify(vma, vma->vm_page_prot) would return false (and
thus MM_CP_DIRTY_ACCT would not be set, and thus your code would
disable the optimization), because of this code at the top of
vma_wants_writenotify:

        /* If it was private or non-writable, the write bit is already clear */
        if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
                return 0;

IIUC, dirty accountable is about whether we can always apply the
optimization no matter what the ref count is, so it isn't suitable for
situations where we need to check the ref count.

Ah I see.. Though it still looks weird e.g. the first check could have been
done before calling change_protection()?

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 96f4df023439..9270140afbbd 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c

@@ -548,7 +548,8 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
         * held in write mode.
         */
        vma->vm_flags = newflags;
-       dirty_accountable = vma_wants_writenotify(vma, vma->vm_page_prot);
+       dirty_accountable = vma_wants_writenotify(vma, vma->vm_page_prot) ||
+           (vma_is_anonymous(vma) && (vma->vm_flags & VM_WRITE));
        vma_set_page_prot(vma);

        change_protection(vma, start, end, vma->vm_page_prot,

Would something like this make the check even faster?

That still doesn't seem like it would work either. I think we need
three kinds of behavior (glossing over a bunch of details):

- always make RW for certain shared pages (this is the original dirty
accountable behavior)
- don't make RW except for page_count==1 for certain private pages
- don't optimize at all in other cases

A single bit isn't enough to cover all of these possibilities.

Yes I guess you're right.

quoted

Meanwhile when I'm looking at the rest I found I cannot understand the other
check in this patch regarding soft dirty:

+       if (!pte_soft_dirty(pte) && (vma->vm_flags & VM_SOFTDIRTY))
+               return false;

I'm wondering why it's not:

+       if (!pte_soft_dirty(pte) && !(vma->vm_flags & VM_SOFTDIRTY))
+               return false;

Then I look back and it's indeed what it does before, starting from commit
64e455079e1b ("mm: softdirty: enable write notifications on VMAs after
VM_SOFTDIRTY cleared", 2014-10-14):

        if (dirty_accountable && pte_dirty(ptent) &&
                (pte_soft_dirty(ptent) ||
                !(vma->vm_flags & VM_SOFTDIRTY)))

However I don't get it... Shouldn't this be "if soft dirty set, or soft dirty
tracking not enabled, then we can grant the write bit"?  The thing is afaiu
VM_SOFTDIRTY works in the reversed way that soft dirty enabled only if it's
cleared.  Hmm... Am I the only one thinks it wrong?

No strong opinions here, I'm just preserving the original logic.

Yeah no reason to block your patch.  I'll think about it.

It's just that the 1st !MM_CP_DIRTY_ACCT check looks very hard to follow,
however indeed I don't have any better idea to rewrite it.

Then the patch looks good to me:

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help