Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number

[PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-14
[PATCH 2/2] kvm: x86: fix stale mmio cache bug · Xiao Guangrong <hidden> · 2014-08-14
Re: [PATCH 2/2] kvm: x86: fix stale mmio cache bug · David Matlack <dmatlack@google.com> · 2014-08-14
Re: [PATCH 2/2] kvm: x86: fix stale mmio cache bug · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-14
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-18
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-19
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Xiao Guangrong <hidden> · 2014-08-20
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · David Matlack <dmatlack@google.com> · 2014-08-20
Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number · Paolo Bonzini <pbonzini@redhat.com> · 2014-08-20

From: Paolo Bonzini <pbonzini@redhat.com>
Date: 2014-08-18 18:47:45
Also in: kvm, lkml

Il 18/08/2014 18:35, Xiao Guangrong ha scritto:

Hi Paolo,

Thank you to review the patch!

On Aug 18, 2014, at 9:57 PM, Paolo Bonzini [off-list ref] wrote:

quoted

Il 14/08/2014 09:01, Xiao Guangrong ha scritto:

quoted

-	update_memslots(slots, new, kvm->memslots->generation);
+	/* ensure generation number is always increased. */
+	slots->generation = old_memslots->generation;
+	update_memslots(slots, new);
	rcu_assign_pointer(kvm->memslots, slots);
	synchronize_srcu_expedited(&kvm->srcu);
+	slots->generation++;

I don't trust my brain enough to review this patch.

Sorry to make you confused. I should expain it more clearly.

Don't worry, it's not your fault. :)

quoted

kvm_current_mmio_generation seems like a very bad (race-prone) API.  One
patch I trust myself reviewing would change a bunch of functions in
kvm_main.c to take a memslots struct.  This would make it easy to
respect the hard and fast rule of not dereferencing the same pointer
twice.  But it would be a tedious change.

kvm_set_memory_region is the only place updating memslot and
kvm_current_mmio_generation accesses memslot by rcu-dereference,
i do not know why other places need to take into account.

The race occurs because gfn_to_pfn_many_atomic or some other function
has already used kvm_memslots().  Calling kvm_memslots() twice is the
root cause the bug.

I think this patch is auditable, page-fault is always called by holding
srcu-lock so that a page fault canï¿½t go across synchronize_srcu_expedited.
Only these cases can happen:

1)  page fault occurs before synchronize_srcu_expedited.
    In this case, vcpu will generate mmio-exit for the memslot being registered
    by the ioctl. Thatï¿½s ok since the ioctl have not finished.

2) page fault occurs after synchronize_srcu_expedited and during
   increasing generation-number.
   In this case, userspace may get wrong mmio-exit (that happen if handing
   page-fault is slower that the ioctl), thatï¿½s ok too since userspace need do
  the check anyway like i said above.

3) page fault occurs after generation-number update
   thatï¿½s definitely correct. :)

quoted

Another alternative could be to use the low bit to mark an in-progress
change, and skip the caching if the low bit is set.  Similar to a
seqcount (except if read_seqcount_retry fails, we just punt and not
retry anything), you could use it even though the memory barriers
provided by write_seqcount_begin/end are not too useful in this case.

I do not know how the bit works, page fault will cache the memslot before
the bit set and cache the generation-number after the bit set.

Maybe i missed your idea, could you please detail it?

Something like this:

-	update_memslots(slots, new, kvm->memslots->generation);
+	/* ensure generation number is always increased. */
+	slots->generation = old_memslots->generation + 1;
+	update_memslots(slots, new);
	rcu_assign_pointer(kvm->memslots, slots);
	synchronize_srcu_expedited(&kvm->srcu);
+	slots->generation++;

Then case 1 and 2 will just have a cache miss.

The "low bit" is really just because each slot update does 2 generation
increases.

Paolo

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help