Thread: 78 messages, 4 authors, 2020-01-15

Re: [PATCH v2 09/18] arm64: KVM: enable conditional save/restore full SPE profiling buffer controls

From: Marc Zyngier <maz@kernel.org>
Date: 2020-01-08 12:36:16
Also in: kvm, kvmarm, lkml

On 2020-01-08 11:58, Will Deacon wrote:
On Wed, Jan 08, 2020 at 11:17:16AM +0000, Marc Zyngier wrote:
quoted
On 2020-01-07 15:13, Andrew Murray wrote:
quoted
On Sat, Dec 21, 2019 at 02:13:25PM +0000, Marc Zyngier wrote:
quoted
On Fri, 20 Dec 2019 14:30:16 +0000
Andrew Murray [off-list ref] wrote:

[somehow managed not to do a reply all, re-sending]
quoted
From: Sudeep Holla <redacted>

Now that we can save/restore the full SPE controls, we can enable it
if SPE is setup and ready to use in KVM. It's supported in KVM only if
all the CPUs in the system support SPE.

However, to support heterogeneous systems, we need to move the check for
whether the host supports SPE and do a partial save/restore.
No. Let's just not go down that path. For now, KVM on heterogeneous
systems does not get SPE.
At present these patches only offer the SPE feature to VCPUs where the
sanitised AA64DFR0 register indicates that all CPUs have this support
(kvm_arm_support_spe_v1) at the time of setting the attribute
(KVM_SET_DEVICE_ATTR).

Therefore if a new CPU comes online without SPE support, and an
existing VCPU is scheduled onto it, then bad things happen - which I
guess must have been the intention behind this patch.
I guess that was the intent.
quoted
quoted
If SPE has been enabled on a guest and a CPU
comes up without SPE, this CPU should fail to boot (same as exposing a
feature to userspace).
I'm unclear as to how to prevent this. We can set the FTR_STRICT flag on
the sanitised register - thus tainting the kernel if such a non-SPE CPU
comes online - though that doesn't prevent KVM from blowing up. Though
I don't believe we can prevent a CPU coming up. At the moment this is
my preferred approach.
I'd be OK with this as a stop-gap measure. Do we know of any existing
design where only half of the CPUs have SPE?
No, but given how few CPUs implement SPE I'd say that this configuration
is inevitable. I certainly went out of my way to support it in the
driver.
quoted
quoted
Looking at the vcpu_load and related code, I don't see a way of saying
'don't schedule this VCPU on this CPU' or bailing in any way.
That would actually be pretty easy to implement. In vcpu_load(), check
that the physical CPU has SPE. If not, raise a request for that vcpu.
In the run loop, check for that request and abort if raised, returning
to userspace.
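
[Editor's sketch] The load-time check and run-loop abort described above
can be modelled in plain C. This is not kernel code: struct vcpu_model,
REQ_SPE_UNAVAILABLE and both function names are hypothetical, and the
set of SPE-capable CPUs is assumed to fit in a 64-bit mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request bit; not a real KVM request number. */
#define REQ_SPE_UNAVAILABLE (1u << 0)

struct vcpu_model {
    bool     spe_enabled;  /* guest was configured with SPE */
    uint32_t requests;     /* pending request bits */
};

/*
 * Called when the vcpu is loaded onto physical CPU 'cpu'.
 * cpu_has_spe is a bitmask of SPE-capable CPUs (bit n = CPU n).
 * CPUs outside the 64-bit mask are treated as lacking SPE.
 */
static void model_vcpu_load(struct vcpu_model *vcpu, unsigned int cpu,
                            uint64_t cpu_has_spe)
{
    bool has_spe = cpu < 64 && ((cpu_has_spe >> cpu) & 1);

    if (vcpu->spe_enabled && !has_spe)
        vcpu->requests |= REQ_SPE_UNAVAILABLE;
}

/* Returns 0 to enter the guest, or -1 to bail out to userspace. */
static int model_vcpu_run_once(struct vcpu_model *vcpu)
{
    if (vcpu->requests & REQ_SPE_UNAVAILABLE) {
        vcpu->requests &= ~REQ_SPE_UNAVAILABLE;  /* consume the request */
        return -1;
    }
    return 0;
}
```

In this model the request is consumed on exit, so a vcpu that userspace
has since moved to an SPE-capable CPU runs normally on the next attempt.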

Userspace can always check /sys/devices/arm_spe_0/cpumask and work out
where to run that particular vcpu.
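
[Editor's sketch] One way userspace might consume that file, assuming it
holds a CPU list such as "0-3,8" and that CPU numbers stay below 64.
parse_cpulist() is a hypothetical helper, not an existing library call.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a CPU list such as "0-3,8" into a bitmask (bit n = CPU n).
 * Returns 0 on success, -1 on malformed input or CPU numbers >= 64.
 */
static int parse_cpulist(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    while (*s && *s != '\n') {
        char *end;
        unsigned long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;
        lo = strtoul(s, &end, 10);
        hi = lo;

        if (*end == '-') {              /* range "lo-hi" */
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
        }
        if (hi < lo || hi >= 64)
            return -1;

        for (unsigned long c = lo; c <= hi; c++)
            m |= (uint64_t)1 << c;

        if (*end == ',')
            end++;                      /* move on to the next entry */
        else if (*end != '\0' && *end != '\n')
            return -1;
        s = end;
    }

    *mask = m;
    return 0;
}
```

The resulting mask could then be turned into a cpu_set_t and applied to
the vcpu thread with sched_setaffinity() or pthread_setaffinity_np().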
It's also worth considering systems where there are multiple
implementations of SPE in play. Assuming we don't want to expose this
to a guest, then the right interface here is probably for userspace to
pick one SPE implementation and expose that to the guest. That fits with
your idea above, where you basically get an immediate exit if we try to
schedule a vCPU onto a CPU that isn't part of the SPE mask.
Then it means that the VM should be configured with a mask indicating
which CPUs it is intended to run on, and setting such a mask is
mandatory for SPE.
quoted
quoted
One solution could be to allow scheduling onto non-SPE CPUs but wrap
the SPE save/restore code in a macro (much like kvm_arm_spe_v1_ready)
that reads the non-sanitised feature register. Therefore we don't go
bang, but we also increase the size of any black-holes in SPE capturing.
Though this
feels like something that will cause grief down the line.
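
[Editor's sketch] The feature-register test such a macro would rely on
boils down to checking the PMSVer field, bits [35:32] of
ID_AA64DFR0_EL1. The helper below works on a raw register value passed
in by the caller; its name and the two macros are hypothetical, and
reading the register on the local CPU is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* PMSVer lives in bits [35:32] of ID_AA64DFR0_EL1; 0 means no SPE. */
#define DFR0_PMSVER_SHIFT 32
#define DFR0_PMSVER_MASK  0xfULL

/* True if the given raw ID_AA64DFR0_EL1 value advertises SPE. */
static bool dfr0_has_spe(uint64_t id_aa64dfr0)
{
    return ((id_aa64dfr0 >> DFR0_PMSVER_SHIFT) & DFR0_PMSVER_MASK) != 0;
}
```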

Is there something else that can be done?
How does userspace deal with this? When SPE is only available on half
of the CPUs, how does perf work in these conditions?
Not sure about userspace, but the kernel driver works by instantiating
an SPE PMU instance only for the CPUs that have it and then that
instance profiles for only those CPUs. You also need to do something
similar if you had two CPU types with SPE, since the SPE configuration
is likely to be different between them.
So that's closer to what Andrew was suggesting above (running a guest on
a non-SPE CPU creates a profiling black hole). Except that we can't
really run an SPE-enabled guest on a non-SPE CPU, as the SPE sysregs
will UNDEF at EL1.

Conclusion: we need a mix of a cpumask to indicate which CPUs we want to
run on (generic, not-SPE related), and a check for SPE-capable CPUs.
If either of these conditions is not satisfied, the vcpu exits for
userspace to sort out the affinity.

I hate heterogeneous systems.

         M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel