Thread (39 messages) 39 messages, 4 authors, 2023-02-23

Re: [PATCH mm-unstable v1 3/5] kvm/arm64: add kvm_arch_test_clear_young()

From: Yu Zhao <hidden>
Date: 2023-02-23 05:26:25
Also in: kvm, kvmarm, linux-arm-kernel, linux-mm, lkml

On Fri, Feb 17, 2023 at 9:00 AM Sean Christopherson [off-list ref] wrote:
On Fri, Feb 17, 2023, Oliver Upton wrote:
quoted
Hi Yu,

scripts/get_maintainers.pl is your friend for getting the right set of
emails for a series :) Don't know about others, but generally I would
prefer to be Cc'ed on an entire series (to gather context) than just an
individual patch.
+1
quoted
On Thu, Feb 16, 2023 at 09:12:28PM -0700, Yu Zhao wrote:
quoted
This patch adds kvm_arch_test_clear_young() for the vast majority of
VMs that are not pKVM and run on hardware that sets the accessed bit
in KVM page tables.
At least for the x86 changes, please read Documentation/process/maintainer-tip.rst
and rewrite the changelogs.
I see -- will remove "this patch".
quoted
quoted
It relies on two techniques, RCU and cmpxchg, to safely test and clear
the accessed bit without taking the MMU lock. The former protects KVM
page tables from being freed while the latter clears the accessed bit
atomically against both the hardware and other software page table
walkers.

Signed-off-by: Yu Zhao <redacted>
---
 arch/arm64/include/asm/kvm_host.h       |  7 +++
 arch/arm64/include/asm/kvm_pgtable.h    |  8 +++
 arch/arm64/include/asm/stage2_pgtable.h | 43 ++++++++++++++
 arch/arm64/kvm/arm.c                    |  1 +
 arch/arm64/kvm/hyp/pgtable.c            | 51 ++--------------
 arch/arm64/kvm/mmu.c                    | 77 ++++++++++++++++++++++++-
 6 files changed, 141 insertions(+), 46 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 35a159d131b5..572bcd321586 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1031,4 +1031,11 @@ static inline void kvm_hyp_reserve(void) { }
 void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
 bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);

+/* see the comments on the generic kvm_arch_has_test_clear_young() */
Please eliminate all of these "see the comments on blah", in every case they do
nothing more than redirect the reader to something they're likely already aware of.
quoted
quoted
+#define kvm_arch_has_test_clear_young kvm_arch_has_test_clear_young
+static inline bool kvm_arch_has_test_clear_young(void)
+{
+   return IS_ENABLED(CONFIG_KVM) && cpu_has_hw_af() && !is_protected_kvm_enabled();
+}
...
quoted
Also, I'm at a loss for why we'd need to test if CONFIG_KVM is enabled.
My expectation is that we should provide an implementation that returns
false if !CONFIG_KVM, avoiding the need to repeat that bit in every
single implementation of the function.
mm/vmscan.c uses kvm_arch_has_test_clear_young().  I have opinions on that, but
I'll hold off on expressing them until there's actual justification presented
somewhere.

Yu, this series and each patch needs a big pile of "why".  I get that the goal
is to optimize memory oversubscribe, but there needs to be justification for
why this is KVM only, why nested VMs and !A/D hardware are out of scope, why yet
another mmu_notifier hook is being added, etc.
I totally agree.

This is an optimization, not a bug fix. It can't be justified without
performance numbers from some common use cases. That burden of proof
clearly rests on me -- I will follow up on that.

For now, I want to make sure the methodical part is clear:
1. We only have limited resources and we need to prioritize major use cases.
2. We can only improve one thing at a time and we can't cover
everything at the same time.
3. We need to focus on the return on investment and the future.

I hope everyone by now agrees with my "the vast majority of VMs ..."
assertion. If not, I'm happy to revisit that [1]. If so, the next step
would be whether we want to focus on the vast majority first. I think
this naturally answers why the nested VMs and !AD h/w are out of
scope, at the moment (I didn't spell this out; probably I should in
v2). After we have taken the first step, we probably can decide
whether there is enough resource and demand to cover the low return on
investment part (but complexity but less common use cases).

[1] https://lore.kernel.org/linux-mm/20230217041230.2417228-1-yuzhao@google.com/ (local)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help