Re: [PATCH mm-unstable v1 5/5] mm: multi-gen LRU: use mmu_notifier_test_clear_young()

From: Yu Zhao <hidden>
Date: 2023-02-23 20:10:19
Also in: kvm, kvmarm, linux-arm-kernel, linux-mm, lkml

On Thu, Feb 23, 2023 at 12:58 PM Sean Christopherson [off-list ref] wrote:

On Thu, Feb 23, 2023, Yu Zhao wrote:

On Thu, Feb 23, 2023 at 12:11 PM Sean Christopherson [off-list ref] wrote:

On Thu, Feb 23, 2023, Yu Zhao wrote:

As alluded to in patch 1, unless batching the walks even if KVM does _not_ support
a lockless walk is somehow _worse_ than using the existing mmu_notifier_clear_flush_young(),
I think batching the calls should be conditional only on LRU_GEN_SPTE_WALK.  Or
if we want to avoid batching when there are no mmu_notifier listeners, probe
mmu_notifiers.  But don't call into KVM directly.

I'm not sure I fully understand. Let's present the problem on the MM
side: assuming KVM supports lockless walks, batching can still be
worse (very unlikely), because GFNs can exhibit no memory locality at
all. So this option allows userspace to disable batching.

I'm asking the opposite.  Is there a scenario where batching+lock is worse than
!batching+lock?  If not, then don't make batching depend on lockless walks.

Yes, absolutely. batching+lock means we take/release mmu_lock for
every single PTE in the entire VA space -- each small batch contains
64 PTEs but the entire batch is the whole KVM.

Who is "we"?

Oops -- shouldn't have used "we".

I don't see anything in the kernel that triggers walking the whole
VMA, e.g. lru_gen_look_around() limits the walk to a single PMD.  I feel like I'm
missing something...

walk_mm() -> walk_pud_range() -> walk_pmd_range() -> walk_pte_range()
-> test_spte_young() -> mmu_notifier_test_clear_young().

MGLRU takes two passes: during the first pass, it sweeps the entire VA
space of each MM (per MM/KVM); during the second pass, it uses the rmap
on each folio (per folio). The look-around exploits the (spatial)
locality in the second pass, to get the most out of the expensive
per-folio rmap walk.

(The first pass can't handle shared mappings; the second pass can.)
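
As a rough illustration of the call chain and the two passes described
above, here is a minimal userspace sketch in C. It is not kernel code:
every toy_* name, the struct toy_mm type, and the tiny table geometry are
assumptions made up for this example, and the rmap lookup of the second
pass is reduced to "the caller already knows one PTE index". It only
shows the shape of the two walks: the first pass visits every PTE of the
address space, while the look-around stays inside a single PMD.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy geometry (assumption): far smaller than real page tables. */
#define TOY_PTES_PER_PMD 8
#define TOY_PMDS_PER_PUD 4
#define TOY_PUDS_PER_MM  2
#define TOY_NR_PTES (TOY_PTES_PER_PMD * TOY_PMDS_PER_PUD * TOY_PUDS_PER_MM)

/* One accessed ("young") bit per toy PTE, plus a visit counter. */
struct toy_mm {
	bool young[TOY_NR_PTES];
	unsigned long visited;	/* PTEs examined so far */
};

/* Stand-in for the test_spte_young() step: test and clear one bit. */
static bool toy_test_clear_young(struct toy_mm *mm, int pte)
{
	bool was_young;

	assert(pte >= 0 && pte < TOY_NR_PTES);
	was_young = mm->young[pte];
	mm->young[pte] = false;
	mm->visited++;
	return was_young;
}

/* Visit every PTE under one PMD; return how many were young. */
static int toy_walk_pte_range(struct toy_mm *mm, int first_pte)
{
	int nr_young = 0;

	for (int i = 0; i < TOY_PTES_PER_PMD; i++)
		nr_young += toy_test_clear_young(mm, first_pte + i);
	return nr_young;
}

/* Visit every PMD under one PUD. */
static int toy_walk_pmd_range(struct toy_mm *mm, int first_pmd)
{
	int nr_young = 0;

	for (int i = 0; i < TOY_PMDS_PER_PUD; i++)
		nr_young += toy_walk_pte_range(mm,
				(first_pmd + i) * TOY_PTES_PER_PMD);
	return nr_young;
}

/* Visit every PUD of the address space. */
static int toy_walk_pud_range(struct toy_mm *mm)
{
	int nr_young = 0;

	for (int i = 0; i < TOY_PUDS_PER_MM; i++)
		nr_young += toy_walk_pmd_range(mm, i * TOY_PMDS_PER_PUD);
	return nr_young;
}

/* First pass: sweep the whole toy VA space of one mm. */
static int toy_walk_mm(struct toy_mm *mm)
{
	return toy_walk_pud_range(mm);
}

/* Second pass: from one known PTE, scan only the PMD that contains it. */
static int toy_look_around(struct toy_mm *mm, int pte)
{
	return toy_walk_pte_range(mm, pte - pte % TOY_PTES_PER_PMD);
}
```

With this geometry, toy_walk_mm() examines all 64 toy PTEs no matter
where the young bits are, whereas toy_look_around() examines only the 8
PTEs that share a PMD with the starting PTE. That difference in scope is
what the per-PTE cost of the first pass is multiplied by.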
