Thread (44 messages) 44 messages, 4 authors, 2021-03-31

Re: [PATCH 16/18] KVM: Don't take mmu_lock for range invalidation unless necessary

From: Sean Christopherson <seanjc@google.com>
Date: 2021-03-31 16:42:33
Also in: kvm, kvmarm, linux-mips, lkml

On Wed, Mar 31, 2021, Paolo Bonzini wrote:
On 26/03/21 03:19, Sean Christopherson wrote:
quoted
+	/*
+	 * Reset the lock used to prevent memslot updates between MMU notifier
+	 * range_start and range_end.  At this point no more MMU notifiers will
+	 * run, but the lock could still be held if KVM's notifier was removed
+	 * between range_start and range_end.  No threads can be waiting on the
+	 * lock as the last reference on KVM has been dropped.  If the lock is
+	 * still held, freeing memslots will deadlock.
+	 */
+	init_rwsem(&kvm->mmu_notifier_slots_lock);
I was going to say that this is nasty,
Heh, I still think it's nasty.
then I noticed that
mmu_notifier_unregister uses SRCU to ensure completion of concurrent calls
to the MMU notifier.  So I guess it's fine, but it's better to point it out:

	/*
	 * At this point no more MMU notifiers will run and pending
	 * calls to range_start have completed, but the lock would
	 * still be held and never released if the MMU notifier was
	 * removed between range_start and range_end.  Since the last
	 * reference to the struct kvm has been dropped, no threads can
	 * be waiting on the lock, but we might still end up taking it
	 * when freeing memslots in kvm_arch_destroy_vm.  Reset the lock
	 * to avoid deadlocks.
	 */

That said, the easiest way to avoid this would be to always update
mmu_notifier_count.
Updating mmu_notifier_count requires taking mmu_lock, which would defeat the
purpose of these shenanigans.  I think it could be made atomic, since mmu_lock
would be taken before the elevated count _must_ be visible, but that would
break the mmu_notifier_range_{start,end} optimization that was recently added.

Or did I completely misunderstand what you're suggesting?
I don't mind the rwsem, but at least I suggest that you
split the patch in two---the first one keeping the mmu_notifier_count update
unconditional, and the second one introducing the rwsem and the on_lock
function kvm_inc_notifier_count.  Please document the new lock in
Documentation/virt/kvm/locking.rst too.
Note, will update docs.
Also, related to the first part of the series, perhaps you could structure
the series in a slightly different way:

1) introduce the HVA walking API in common code, complete with on_lock and
patch 15, so that you can use on_lock to increase mmu_notifier_seq

2) then migrate all architectures including x86 to the new API

IOW, first half of patch 10 and all of patch 15; then the second half of
patch 10; then patches 11-14.
quoted
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+	down_write(&kvm->mmu_notifier_slots_lock);
+#endif
 	rcu_assign_pointer(kvm->memslots[as_id], slots);
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+	up_write(&kvm->mmu_notifier_slots_lock);
+#endif
Please do this unconditionally, the cost is minimal if the rwsem is not
contended (as is the case if the architecture doesn't use MMU notifiers at
all).
It's not the cost, it's that mmu_notifier_slots_lock doesn't exist.  That's an
easily solved problem, but then the lock wouldn't be initialized since
kvm_init_mmu_notifier() is a nop.  That's again easy to solve, but IMO would
look rather weird.  I guess the counter argument is that __kvm_memslots()
wouldn't need #ifdeffery.

These are the to ideas I've come up with:

Option 1:
	static int kvm_init_mmu_notifier(struct kvm *kvm)
	{
		init_rwsem(&kvm->mmu_notifier_slots_lock);

	#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
		kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops;
		return mmu_notifier_register(&kvm->mmu_notifier, current->mm);
	#else
		return 0;
	#endif
	}


Option 2:
	kvm_mmu_notifier_lock(kvm);
	rcu_assign_pointer(kvm->memslots[as_id], slots);
	kvm_mmu_notifier_unlock(kvm);





_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help