Thread (32 messages), 6 authors, 2021-08-10

Re: KVM's support for non default APIC base

From: Maxim Levitsky <hidden>
Date: 2021-08-10 20:42:40
Also in: lkml

On Mon, 2021-08-09 at 09:47 -0700, Jim Mattson wrote:
On Mon, Aug 9, 2021 at 2:40 AM Maxim Levitsky [off-list ref] wrote:
On Fri, 2021-08-06 at 21:55 +0000, Sean Christopherson wrote:
On Thu, Jul 22, 2021, Maxim Levitsky wrote:
On Mon, 2021-07-19 at 18:49 +0000, Sean Christopherson wrote:
On Sun, Jul 18, 2021, Maxim Levitsky wrote:
-> APIC MMIO area has to be MMIO for 'apic_mmio_write' to be called,
   thus must contain no guest memslots.
   If the guest relocates the APIC base somewhere where we have a memslot,
   memslot will take priority, while on real hardware, LAPIC is likely to
   take priority.
Yep.  The thing that really bites us is that other vCPUs should still be able to
access the memory defined by the memslot, e.g. to make it work we'd have to run
the vCPU with a completely different MMU root.
That is something I hadn't taken into account.
The complexity of supporting this indeed isn't worth it.
As far as I know, the only good reason to relocate the APIC base is to access it
from real mode, which is not something modern BIOSes do these days.

I vote to make it read-only (#GP on an MSR_IA32_APICBASE write when a non-default
base is set and the APIC is enabled) and remove all remnants of the support for
a variable APIC base.
Making up our own behavior is almost never the right approach.  E.g. _best_ case
scenario for an unexpected #GP is the guest immediately terminates.  Worst case
scenario is the guest eats the #GP and continues on, which is basically the status
quo, except it's guaranteed to now work, whereas today's behavior can at least let
the guest function, for some definitions of "function".
Well, at least Intel's PRM does state that APIC base relocation is not guaranteed
to work on all CPUs, so giving the guest a #GP is like telling it that the current CPU
doesn't support it. In theory, a very well-behaved guest can catch the exception and
fall back to the default base.

I don't understand what you mean by 'guaranteed to now work'. If the guest
ignores this #GP and still thinks that the APIC base relocation worked, that is its
own fault. A well-behaved guest should never assume that an MSR write that failed
with #GP worked.
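
For illustration, the #GP policy proposed above can be sketched as a small
standalone check. This is only a sketch under stated assumptions, not KVM code:
the helper name apicbase_write_should_gp() and the APICBASE_ADDR_MASK constant
are hypothetical, while APIC_DEFAULT_PHYS_BASE and MSR_IA32_APICBASE_ENABLE
follow the names and values used in the Linux headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default local APIC base and the APIC global enable bit (bit 11). */
#define APIC_DEFAULT_PHYS_BASE    0xfee00000ULL
#define MSR_IA32_APICBASE_ENABLE  (1ULL << 11)

/* Hypothetical mask for this sketch: base address field, bits 12..51. */
#define APICBASE_ADDR_MASK        0x000ffffffffff000ULL

/* The default base must fit entirely inside the address field. */
static_assert((APIC_DEFAULT_PHYS_BASE & ~APICBASE_ADDR_MASK) == 0,
              "default APIC base must be page aligned");

/*
 * Hypothetical helper: returns true if a guest write of new_val to
 * MSR_IA32_APICBASE should be rejected with #GP under the proposed
 * policy, i.e. the APIC is enabled and the base is not the default.
 */
static bool apicbase_write_should_gp(uint64_t new_val)
{
    bool enabled = (new_val & MSR_IA32_APICBASE_ENABLE) != 0;
    uint64_t base = new_val & APICBASE_ADDR_MASK;

    return enabled && base != APIC_DEFAULT_PHYS_BASE;
}
```

With this check, a write that keeps the default base, or one that relocates the
base while the APIC is disabled, is let through; only an enabled APIC at a
non-default base is refused.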

I think the only viable "solution" is to exit to userspace on the guilty WRMSR.
Whether or not we can do that without breaking userspace is probably the big
question.  Fully emulating APIC base relocation would be a tremendous amount of
effort and complexity for practically zero benefit.
I have nothing against this either, although I like the #GP approach a bit more.
Given that there are barely any reasons to relocate the APIC base, and that it
doesn't work well, there is a good chance that no one does it anyway (except our
kvm unit tests, but that isn't an issue).
(we already have a warning when the APIC base is set to a non-default value)
FWIW, that warning is worthless because it's _once(), i.e. won't help detect a
misbehaving guest unless it's the first guest to misbehave on a particular
instantiation of KVM.   _ratelimited() would improve the situation, but not
completely eliminate the possibility of a misbehaving guest going unnoticed.
Anything else isn't an option because it's obviously guest triggerable.
100% agree.
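
To make the difference between the two warning styles concrete, here is a
simplified userspace model. It is only a sketch under stated assumptions: the
types and helpers (once_gate, ratelimit, once_should_warn(),
ratelimit_should_warn()) are hypothetical and are not the kernel's
implementation, and time is an abstract tick counter passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "Once" gate: reports the first event only, then stays silent forever. */
struct once_gate {
    bool fired;
};

static bool once_should_warn(struct once_gate *g)
{
    if (g->fired)
        return false;
    g->fired = true;
    return true;
}

/* Simplified rate limiter: at most `burst` messages per `interval` ticks. */
struct ratelimit {
    uint64_t interval;      /* window length in ticks */
    unsigned int burst;     /* messages allowed per window */
    uint64_t window_start;  /* tick at which the current window began */
    unsigned int printed;   /* messages emitted in the current window */
};

static void ratelimit_init(struct ratelimit *rl, uint64_t interval,
                           unsigned int burst)
{
    assert(interval > 0 && burst > 0);
    rl->interval = interval;
    rl->burst = burst;
    rl->window_start = 0;
    rl->printed = 0;
}

static bool ratelimit_should_warn(struct ratelimit *rl, uint64_t now)
{
    /* Open a fresh window once the current one has expired. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;   /* suppressed: this event goes unreported */
    rl->printed++;
    return true;
}
```

In this model the once gate never reports a second misbehaving guest, while the
rate limiter reports again in each new window but still drops events that
arrive after the burst is used up, which is the residual gap mentioned above.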

I'd say I would first make it _ratelimited() for a few KVM versions, and then,
if nobody complains, make it a KVM internal error / #GP, and remove all the leftovers
from the code that pretend that it can work.
Printing things to syslog is not very helpful. Any time that KVM
violates the architectural specification, it should provide
information about the emulation error to userspace.
Paolo, what do you think?

My personal opinion is that we should indeed raise a KVM internal error
on all attempts to change the APIC base.

If someone complains, then we can look at their use-case.

My view is that any half-working feature is bound to bitrot
and cause harm and confusion.

Best regards,
	Maxim Levitsky
