Thread (28 messages), 3 authors, 2021-08-27

Re: [PATCH 05/15] perf: Track guest callbacks on a per-CPU basis

From: Sean Christopherson <seanjc@google.com>
Date: 2021-08-27 15:22:42
Also in: kvm, kvmarm, linux-arm-kernel, linux-riscv, lkml, xen-devel

On Fri, Aug 27, 2021, Peter Zijlstra wrote:
> On Fri, Aug 27, 2021 at 02:49:50PM +0000, Sean Christopherson wrote:
> > On Fri, Aug 27, 2021, Peter Zijlstra wrote:
> > > On Thu, Aug 26, 2021 at 05:57:08PM -0700, Sean Christopherson wrote:
> > > > Use a per-CPU pointer to track perf's guest callbacks so that KVM can set
> > > > the callbacks more precisely and avoid a lurking NULL pointer dereference.
> > >
> > > I'm completely failing to see how per-cpu helps anything here...
> >
> > It doesn't help until KVM is converted to set the per-cpu pointer in flows that
> > are protected against preemption, and more specifically when KVM only writes to
> > the pointer from the owning CPU.
>
> So the 'problem' I have with this is that sane (!KVM using) people, will
> still have to suffer that load, whereas with the static_call() we patch
> in an 'xor %rax,%rax' and only have immediate code flow.

Again, I've no objection to the static_call() approach.  I didn't even see the
patch until I had finished testing my series :-/

> > Ignoring static call for the moment, I don't see how the unreg side can be safe
> > using a bare single global pointer.  There is no way for KVM to prevent an NMI
> > from running in parallel on a different CPU.  If there's a more elegant solution,
> > especially something that can be backported, e.g. an rcu-protected pointer, I'm
> > all for it.  I went down the per-cpu path because it allowed for cleanups in KVM,
> > but similar cleanups can be done without per-cpu perf callbacks.
>
> If all the perf_guest_cbs dereferences are with preemption disabled
> (IRQs disabled, IRQ context, NMI context included), then the sequence:
>
> 	WRITE_ONCE(perf_guest_cbs, NULL);
> 	synchronize_rcu();
>
> Ensures that all prior observers of perf_guest_cbs will have completed
> and future observers must observe the NULL value.

That alone won't be sufficient, as the read side also needs to ensure it doesn't
reload perf_guest_cbs between NULL checks and dereferences.  But that's easy
enough to solve with a READ_ONCE and maybe a helper to make it more cumbersome
to use perf_guest_cbs directly.
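
To make the read-side point concrete, here is a rough userspace sketch, not
kernel code and not the actual patch.  READ_ONCE/WRITE_ONCE are simplified
stand-ins for the kernel macros, synchronize_rcu() is stubbed out as a no-op,
and perf_get_guest_cbs(), perf_sample_ip() and the demo_* callbacks are
hypothetical names used only for illustration.  The register/unregister
signatures are simplified as well.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros: force a single access. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Stub: the real synchronize_rcu() waits for all pre-existing readers. */
static void synchronize_rcu(void) { }

struct perf_guest_info_callbacks {
	int (*is_in_guest)(void);
	unsigned long (*get_guest_ip)(void);
};

static struct perf_guest_info_callbacks *perf_guest_cbs;

/* Hypothetical helper: the one place that loads the global pointer. */
static struct perf_guest_info_callbacks *perf_get_guest_cbs(void)
{
	return READ_ONCE(perf_guest_cbs);
}

static void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
{
	WRITE_ONCE(perf_guest_cbs, cbs);
}

static void perf_unregister_guest_info_callbacks(void)
{
	WRITE_ONCE(perf_guest_cbs, NULL);
	synchronize_rcu();	/* after this, no reader still holds the old pointer */
}

/*
 * Hypothetical reader: snapshot the pointer once, then NULL-check and
 * dereference only the snapshot so the compiler cannot reload the global
 * between the check and the call.
 */
static unsigned long perf_sample_ip(unsigned long host_ip)
{
	struct perf_guest_info_callbacks *cbs = perf_get_guest_cbs();

	if (cbs && cbs->is_in_guest())
		return cbs->get_guest_ip();
	return host_ip;
}

/* Demo callbacks, purely illustrative. */
static int demo_is_in_guest(void) { return 1; }
static unsigned long demo_get_guest_ip(void) { return 0x1000; }

static struct perf_guest_info_callbacks demo_cbs = {
	.is_in_guest	= demo_is_in_guest,
	.get_guest_ip	= demo_get_guest_ip,
};

/* Usage: guest IP while registered, host IP once unregistered. */
static void perf_guest_cbs_demo(void)
{
	perf_register_guest_info_callbacks(&demo_cbs);
	assert(perf_sample_ip(0x42) == 0x1000);

	perf_unregister_guest_info_callbacks();
	assert(perf_sample_ip(0x42) == 0x42);
}
```

The sketch is single-threaded, so it only shows the shape of the code.  The
actual guarantee depends on every reader running with preemption disabled and
on the real synchronize_rcu() on the unregister side.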

How about this for a series?

  1. Use READ_ONCE/WRITE_ONCE + synchronize_rcu() to fix the underlying bug
  2. Fix KVM PT interrupt handler bug
  3. Kill off perf_guest_cbs usage in architectures that don't need the callbacks
  4. Replace ->is_in_guest()/->is_user_mode() with ->state(), and s/get_guest_ip/get_ip
  5. Implement static_call() support
  6. Cleanups, if there are any
  7..N KVM cleanups, e.g. to eliminate current_vcpu and share x86+arm64 callbacks