Thread (6 messages) 6 messages, 2 authors, 2020-08-28

Re: [PATCH] arm64/relocate_kernel: remove redundant but misleading code

From: Pingfan Liu <hidden>
Date: 2020-08-07 14:16:31

Hi Morse,

Appreciate for your patient explanation. I have no experience of
arm/kvm and after a quick through, I still have some questions. Please
correct me if I am wrong.

On Thu, Aug 6, 2020 at 8:20 PM James Morse [off-list ref] wrote:
Hi Liu,

On 06/08/2020 09:26, Pingfan Liu wrote:
quoted
The kexec switch sequence looks like the following:
  SYM_CODE_START(__cpu_soft_restart)
    ...
    pre_disable_mmu_workaround
    msr       sctlr_el1, x12
    ...
    br        x8

  SYM_CODE_START(arm64_relocate_new_kernel)
    ...
    pre_disable_mmu_workaround
    msr       sctlr_el2, x0
    ...
quoted
"msr sctlr_el2, x0" is misleading, because "br x8" jump to a physical
address, which has no entry in idmap.
Even better: this code run from a copy allocated by kexec, its not in the idmap either.
Yes, looks better.
See the memcpy() in machine_kexec().

quoted
It implies that MMU has already been fully off after "msr sctlr_el1, x12".
quoted
And according to "D12.2.101 SCTLR_EL2, System Control Register (EL2)" in
"ARM Architecture Reference Manual", actually, EL2&0 host accesses
to SCTLR_EL2 when using mnemonic SCTLR_EL1.
Only when HCR_EL2.E2H is enabled. If linux booted at EL2 on a non-VHE system, SCTLR_EL1
and SCTLR_EL2 are different registers, both of which are managed by linux/KVM.
Yeah. I have realized these factors when composing this patch, but not
sure anything is missing.

Kexec on arm is introduced by the following two commits
d28f6df arm64/kexec: Add core kexec support
f9076ec arm64: Add back cpu reset routines

And at d28f6df, the code snippet is
in kernel/cpu-reset.S,
ENTRY(__cpu_soft_restart)
        /* Clear sctlr_el1 flags. */
        mrs     x12, sctlr_el1
        ldr     x13, =SCTLR_ELx_FLAGS
        bic     x12, x12, x13
        msr     sctlr_el1, x12
        isb

        cbz     x0, 1f                          // el2_switch?
        mov     x0, #HVC_SOFT_RESTART
        hvc     #0                              // no return
        //--> as the note, I think for both EL1&0 host and guest, they
will never resume the following code
        //--> so in the original patch, the rest code is only for EL2&0 host

1:      mov     x18, x1                         // entry
        mov     x0, x2                          // arg0
        mov     x1, x3                          // arg1
        mov     x2, x4                          // arg2
        br      x18
ENDPROC(__cpu_soft_restart)

And in kernel/hyp-stub.S
el1_sync:
        mrs     x30, esr_el2
        lsr     x30, x30, #ESR_ELx_EC_SHIFT

        cmp     x30, #ESR_ELx_EC_HVC64
        b.ne    9f                              // Not an HVC trap

        cmp     x0, #HVC_GET_VECTORS
        b.ne    1f
        mrs     x0, vbar_el2
        b       9f

1:      cmp     x0, #HVC_SET_VECTORS
        b.ne    2f
        msr     vbar_el2, x1
        b       9f

2:      cmp     x0, #HVC_SOFT_RESTART
        b.ne    3f
        mov     x0, x2
        mov     x2, x4
        mov     x4, x1
        mov     x1, x3
        br      x4                              // no return

        /* Someone called kvm_call_hyp() against the hyp-stub... */
3:      mov     x0, #ARM_EXCEPTION_HYP_GONE

9:      eret
ENDPROC(el1_sync)

I think for a soft restart, it jumps to the new kernel by 'br x4'. But
it seems a bug, which does not clear I+C bits, not disable EL2
translation regime.

On the other hand, if it wants to resume execution immediately after
"hvc #0" in ENTRY(__cpu_soft_restart), it should eret by the
modification of "spsr_el2.M[3:0] = PSR_MODE_EL2h", the code originates
from EL1, but the rest code tries to modify EL2 registers.

By this way, the code can do the exact things you mentioned. But there
seems to be a missing of such mechanisms.
quoted
Hence removing the redundant but misleading code.
This isn't the reason its redundant...

quoted
diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 4a18055..37721eb 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -35,6 +35,10 @@ SYM_CODE_START(__cpu_soft_restart)
      mov_q   x13, SCTLR_ELx_FLAGS
      bic     x12, x12, x13
      pre_disable_mmu_workaround
+     /*
+      * either disable EL1&0 translation regime or disable EL2&0 translation
+      * regime if HCR_EL2.E2H == 1
+      */>    msr     sctlr_el1, x12
      isb
On a VHE system, yes the cpu-reset.S disables EL2&0 by writing to SCTLR_EL1

But on a non-VHE system, that same code disabled EL1&0. cup-reset.S goes on to call
HVC_SOFT_RESTART for EL2, which may be serviced by KVM or the hyp-stub. (or maybe
something else that implements the hyp-stub api)

For kexec, on non-VHE both EL1&0 and EL2 get disabled.
Yes. I see the latest code in SYM_CODE_START(__kvm_handle_stub_hvc)
does the things exactly as what you said.
quoted
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 542d6ed..84eec95 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -36,18 +36,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
      mov     x14, xzr                        /* x14 = entry ptr */
      mov     x13, xzr                        /* x13 = copy dest */

-     /* Clear the sctlr_el2 flags. */
-     mrs     x0, CurrentEL
-     cmp     x0, #CurrentEL_EL2
-     b.ne    1f
-     mrs     x0, sctlr_el2
-     mov_q   x1, SCTLR_ELx_FLAGS
-     bic     x0, x0, x1
-     pre_disable_mmu_workaround
-     msr     sctlr_el2, x0
-     isb
-1:
I agree this doesn't disable the MMU anymore.

This was originally kept to disable the I+C bits when Kdump interrupted KVM, but since KVM
formalised the hyp-stub API, and has this exact sequence to back its HVC_SOFT_RESTART, it
was only needed for the hyp-stub itself, which has no clue about these SCTLR_EL2 bits.

HVC_SOFT_RESTART only says it needs to disable the MMU. See
Documentation/virt/kvm/arm/hyp-abi.rst


I think its fine to remove this, but the reason is because el2_setup doesn't set those
bits, and KVM clears them when its unloaded, or has a HVC_SOFT_RESTART call.

It might be worth updating the document, but we'd need to check the guarantee is the same
on 32bit. I assume there is no out-of-tree user of the hyp-stub abi.
I have no idea about 32bit. Could you enlighten me?

Again, I am a fresh man on arm/kvm. Please forgive me if I make any
silly mistakes.

Thanks for your time.

Pingfan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help