Thread: 68 messages, 7 authors, 2026-04-21

Re: [PATCH 14/30] KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()

From: Quentin Perret <hidden>
Date: 2026-01-09 14:58:03
Also in: kvmarm

On Friday 09 Jan 2026 at 14:35:57 (+0000), Will Deacon wrote:
On Tue, Jan 06, 2026 at 02:59:19PM +0000, Quentin Perret wrote:
On Monday 05 Jan 2026 at 15:49:22 (+0000), Will Deacon wrote:
During teardown of a protected guest, its memory pages must be reclaimed
from the hypervisor by issuing the '__pkvm_reclaim_dying_guest_page'
hypercall.

Add a new helper, __pkvm_pgtable_stage2_reclaim(), which is called
during the VM teardown operation to reclaim pages from the hypervisor
and drop the GUP pin on the host.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/pkvm.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index 1814e17d600e..8be91051699e 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -322,6 +322,32 @@ int pkvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
 	return 0;
 }
 
+static int __pkvm_pgtable_stage2_reclaim(struct kvm_pgtable *pgt, u64 start, u64 end)
+{
+	struct kvm *kvm = kvm_s2_mmu_to_kvm(pgt->mmu);
+	pkvm_handle_t handle = kvm->arch.pkvm.handle;
+	struct pkvm_mapping *mapping;
+	int ret;
+
+	for_each_mapping_in_range_safe(pgt, start, end, mapping) {
+		struct page *page;
+
+		ret = kvm_call_hyp_nvhe(__pkvm_reclaim_dying_guest_page,
+					handle, mapping->gfn);
+		if (WARN_ON(ret))
+			return ret;
+
+		page = pfn_to_page(mapping->pfn);
+		WARN_ON_ONCE(mapping->nr_pages != 1);
+		unpin_user_pages_dirty_lock(&page, 1, true);
+		account_locked_vm(current->mm, 1, false);
+		pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
+		kfree(mapping);
Nit: this might take a while, worth adding a cond_resched() in here?
Commit 4ddfab5436b6 ("KVM: arm64: Reschedule as needed when destroying
the stage-2 page-tables") breaks the destruction operation up into
chunks so we now reschedule from the caller in stage2_destroy_range().

Do you think we need an additional reschedule?
I would intuitively think so, but I've also just realized we already
lack that resched for non-protected (np) guests, so this patch is
absolutely fine as-is.

It's just that knocking down a whole 1G (assuming 4K pages) will be
significantly faster for normal KVM than for pKVM. It's one call in
pgtable.c with possibly a single deferred TLBI vs 1 hcall and 1 TLBI per
4K page. But I guess this is really performance tuning, so this can
always come later. Feel free to ignore :)
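
For a sense of scale, here is a back-of-the-envelope sketch of the pKVM side of that comparison. It is not part of the patch: pages_in_range(), struct teardown_cost and pkvm_teardown_cost() are hypothetical names, and the cost model simply assumes one hypercall and one TLBI per mapped page, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages needed to cover `bytes`, rounding up. */
static uint64_t pages_in_range(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (bytes + page_size - 1) / page_size;
}

/* Hypothetical cost model, for illustration only. */
struct teardown_cost {
	uint64_t hcalls;	/* reclaim hypercalls issued */
	uint64_t tlbis;		/* TLB invalidations performed */
};

/*
 * Assumption taken from the discussion above: tearing down a pKVM
 * guest range costs one hypercall and one TLBI per page.
 */
static struct teardown_cost pkvm_teardown_cost(uint64_t bytes, uint64_t page_size)
{
	uint64_t n = pages_in_range(bytes, page_size);
	struct teardown_cost cost = { .hcalls = n, .tlbis = n };

	return cost;
}
```

Under these assumptions, pkvm_teardown_cost(1ULL << 30, 4096) gives 262144 hypercalls and as many TLBIs for a single 1G range, against one pgtable.c call and possibly a single deferred TLBI for normal KVM.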