Re: [PATCH v6 32/48] KVM: PPC: Book3S HV P9: Read machine check registers while MSR[RI] is 0
From: Nicholas Piggin <npiggin@gmail.com>
Date: 2021-04-10 00:39:47
Excerpts from Alexey Kardashevskiy's message of April 9, 2021 6:55 pm:
On 05/04/2021 11:19, Nicholas Piggin wrote:quoted
SRR0/1, DAR, DSISR must all be protected from machine check which can clobber them. Ensure MSR[RI] is clear while they are live. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/kvm/book3s_hv.c | 11 +++++++-- arch/powerpc/kvm/book3s_hv_interrupt.c | 33 +++++++++++++++++++++++--- arch/powerpc/kvm/book3s_hv_ras.c | 2 ++ 3 files changed, 41 insertions(+), 5 deletions(-)diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index d6eecedaa5a5..5f0ac6567a06 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c@@ -3567,11 +3567,16 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit, mtspr(SPRN_BESCR, vcpu->arch.bescr); mtspr(SPRN_WORT, vcpu->arch.wort); mtspr(SPRN_TIDR, vcpu->arch.tid); - mtspr(SPRN_DAR, vcpu->arch.shregs.dar); - mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr); mtspr(SPRN_AMR, vcpu->arch.amr); mtspr(SPRN_UAMOR, vcpu->arch.uamor); + /* + * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI] + * clear (or hstate set appropriately to catch those registers + * being clobbered if we take a MCE or SRESET), so those are done + * later. + */ + if (!(vcpu->arch.ctrl & 1)) mtspr(SPRN_CTRLT, mfspr(SPRN_CTRLF) & ~1);@@ -3614,6 +3619,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit, hvregs.vcpu_token = vcpu->vcpu_id; } hvregs.hdec_expiry = time_limit; + mtspr(SPRN_DAR, vcpu->arch.shregs.dar); + mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr); trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs), __pa(&vcpu->arch.regs)); kvmhv_restore_hv_return_state(vcpu, &hvregs);diff --git a/arch/powerpc/kvm/book3s_hv_interrupt.c b/arch/powerpc/kvm/book3s_hv_interrupt.c index 6fdd93936e16..e93d2a6456ff 100644 --- a/arch/powerpc/kvm/book3s_hv_interrupt.c +++ b/arch/powerpc/kvm/book3s_hv_interrupt.c@@ -132,6 +132,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc s64 hdec; u64 tb, purr, spurr; u64 *exsave; + bool ri_set; unsigned long msr = mfmsr(); int trap; unsigned long host_hfscr = mfspr(SPRN_HFSCR);@@ -203,9 +204,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc */ mtspr(SPRN_HDEC, hdec); - mtspr(SPRN_SRR0, vcpu->arch.shregs.srr0); - mtspr(SPRN_SRR1, vcpu->arch.shregs.srr1); - start_timing(vcpu, &vcpu->arch.rm_entry); vcpu->arch.ceded = 0;@@ -231,6 +229,13 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc */ mtspr(SPRN_HDSISR, HDSISR_CANARY); + __mtmsrd(0, 1); /* clear RI */ + + mtspr(SPRN_DAR, vcpu->arch.shregs.dar); + mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr); + mtspr(SPRN_SRR0, vcpu->arch.shregs.srr0); + mtspr(SPRN_SRR1, vcpu->arch.shregs.srr1); + accumulate_time(vcpu, &vcpu->arch.guest_time); local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_GUEST_HV_FAST;@@ -248,7 +253,13 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc /* 0x2 bit for HSRR is only used by PR and P7/8 HV paths, clear it */ trap = local_paca->kvm_hstate.scratch0 & ~0x2; + + /* HSRR interrupts leave MSR[RI] unchanged, SRR interrupts clear it. */ + ri_set = false; if (likely(trap > BOOK3S_INTERRUPT_MACHINE_CHECK)) { + if (trap != BOOK3S_INTERRUPT_SYSCALL && + (vcpu->arch.shregs.msr & MSR_RI)) + ri_set = true; exsave = local_paca->exgen; } else if (trap == BOOK3S_INTERRUPT_SYSTEM_RESET) { exsave = local_paca->exnmi;@@ -258,6 +269,22 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc vcpu->arch.regs.gpr[1] = local_paca->kvm_hstate.scratch1; vcpu->arch.regs.gpr[3] = local_paca->kvm_hstate.scratch2; + + /* + * Only set RI after reading machine check regs (DAR, DSISR, SRR0/1) + * and hstate scratch (which we need to move into exsave to make + * re-entrant vs SRESET/MCE) + */ + if (ri_set) { + if (unlikely(!(mfmsr() & MSR_RI))) { + __mtmsrd(MSR_RI, 1); + WARN_ON_ONCE(1); + } + } else { + WARN_ON_ONCE(mfmsr() & MSR_RI); + __mtmsrd(MSR_RI, 1); + } + vcpu->arch.regs.gpr[9] = exsave[EX_R9/sizeof(u64)]; vcpu->arch.regs.gpr[10] = exsave[EX_R10/sizeof(u64)]; vcpu->arch.regs.gpr[11] = exsave[EX_R11/sizeof(u64)];diff --git a/arch/powerpc/kvm/book3s_hv_ras.c b/arch/powerpc/kvm/book3s_hv_ras.c index d4bca93b79f6..8d8a4d5f0b55 100644 --- a/arch/powerpc/kvm/book3s_hv_ras.c +++ b/arch/powerpc/kvm/book3s_hv_ras.c@@ -199,6 +199,8 @@ static void kvmppc_tb_resync_done(void) * know about the exact state of the TB value. Resync TB call will * restore TB to host timebase. * + * This could use the new OPAL_HANDLE_HMI2 to avoid resyncing TB every time.Educating myself - is it because OPAL_HANDLE_HMI2 tells if it is TB/TOD which is the problem so we can avoid calling opal_resync_timebase() if it is not TB?
Yes.
OPAL_HANDLE_HMI2 does not seem to resync TB itself. The comment just does not seem related to the rest of the patch.
Yeah it's not related, I'll take it out.
Otherwise, looks good. Reviewed-by: Alexey Kardashevskiy <redacted>
Thanks, Nick