Re: [PATCH v3 26/35] mm: fall back to mmap_lock if vma->anon_vma is not yet set
From: Suren Baghdasaryan <surenb@google.com>
Date: 2023-02-17 02:15:16
Also in: linux-arm-kernel, linux-mm, lkml
On Thu, Feb 16, 2023 at 11:43 AM Suren Baghdasaryan wrote:
> On Thu, Feb 16, 2023 at 7:44 AM Matthew Wilcox wrote:
> > On Wed, Feb 15, 2023 at 09:17:41PM -0800, Suren Baghdasaryan wrote:
> > > When vma->anon_vma is not set, page fault handler will set it by either reusing anon_vma of an adjacent VMA if VMAs are compatible or by allocating a new one. find_mergeable_anon_vma() walks VMA tree to find a compatible adjacent VMA and that requires not only the faulting VMA to be stable but also the tree structure and other VMAs inside that tree. Therefore locking just the faulting VMA is not enough for this search. Fall back to taking mmap_lock when vma->anon_vma is not set. This situation happens only on the first page fault and should not affect overall performance.
> >
> > I think I asked this before, but don't remember getting an answer. Why do we defer setting anon_vma to the first fault? Why don't we set it up at mmap time?
>
> Yeah, I remember that conversation, Matthew, and I could not find the definitive answer at the time. I'll look into that again or maybe someone can answer it here.
After looking into it again I'm still under the impression that vma->anon_vma is populated lazily (during the first page fault rather than at mmap time) to avoid doing extra work for areas which are never faulted. Though I might be missing some important detail here.
In the end, rather than changing that logic, I decided to skip the vma->anon_vma == NULL cases (they fall back to mmap_lock) because I measured them at less than 0.01% of all page faults, so the ROI from changing that would be quite low. But I agree that the logic is weird and maybe we can improve it. I will have to review that again when I'm working on eliminating all these special cases we skip, like swap/userfaults/etc.
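To make the fallback concrete, here is a rough userspace sketch of the decision, not the actual patch. The struct definitions are toy stand-ins for the kernel ones, and choose_fault_path() and handle_fault() are hypothetical names used only for illustration. The sketch assumes that whatever sets vma->anon_vma happens only on the mmap_lock path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct anon_vma { int unused; };

struct vma {
	struct anon_vma *anon_vma;	/* NULL until the first fault sets it */
};

enum fault_path {
	FAULT_PER_VMA_LOCK,	/* handled while holding only the per-VMA lock */
	FAULT_MMAP_LOCK,	/* falls back to taking mmap_lock */
};

/*
 * Hypothetical helper: pick the locking path for a fault.
 * With anon_vma unset, the handler may have to search neighbouring VMAs
 * for a compatible anon_vma, which needs the whole tree to be stable,
 * so locking only the faulting VMA is not enough.
 */
static enum fault_path choose_fault_path(const struct vma *vma)
{
	if (vma->anon_vma == NULL)
		return FAULT_MMAP_LOCK;
	return FAULT_PER_VMA_LOCK;
}

/*
 * Hypothetical model of one fault: on the mmap_lock path the anon_vma
 * gets populated (here simply assigned from 'av'), so later faults on
 * the same VMA can stay on the per-VMA lock path.
 */
static enum fault_path handle_fault(struct vma *vma, struct anon_vma *av)
{
	enum fault_path path;

	assert(vma != NULL && av != NULL);

	path = choose_fault_path(vma);
	if (path == FAULT_MMAP_LOCK)
		vma->anon_vma = av;	/* models the lazy setup under mmap_lock */
	return path;
}
```

In this model only the first fault on a VMA takes the slow path; every later fault sees a non-NULL anon_vma and stays under the per-VMA lock, which matches the "first page fault only" argument above.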