Re: [PATCH] arm64/relocate_kernel: remove redundant but misleading code
From: Pingfan Liu <hidden>
Date: 2020-08-12 14:23:04
Hi Morse,
I have read the arm64/kvm code, and it is clearer to me now.
What do you think about the following commit log (it just describes the facts)?
arm64/relocate_kernel: remove redundant code
The kexec switch sequence looks like the following:
SYM_CODE_START(__cpu_soft_restart)
	/* Clear sctlr_el1 flags. */
	mrs	x12, sctlr_el1
	mov_q	x13, SCTLR_ELx_FLAGS
	bic	x12, x12, x13
	pre_disable_mmu_workaround
	msr	sctlr_el1, x12
	isb
	cbz	x0, 1f				// el2_switch?
	mov	x0, #HVC_SOFT_RESTART
	hvc	#0				// no return
1:	mov	x8, x1				// entry
	mov	x0, x2				// arg0
	mov	x1, x3				// arg1
	mov	x2, x4				// arg2
	br	x8
SYM_CODE_END(__cpu_soft_restart)
SYM_CODE_START(arm64_relocate_new_kernel)
	...
	pre_disable_mmu_workaround
	msr	sctlr_el2, x0
	...
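For readers less used to the mnemonics: the mrs/bic/msr triple above is a plain read-modify-write that clears a mask of bits. A minimal C sketch of that step follows. It only assumes the architectural positions of the M, C and I bits; the kernel's SCTLR_ELx_FLAGS covers more bits than this, and clear_flags() is a hypothetical helper name, not a kernel function.

```c
#include <stdint.h>

/* Architectural SCTLR_ELx bit positions (subset, for illustration only). */
#define SCTLR_M	(UINT64_C(1) << 0)	/* MMU enable */
#define SCTLR_C	(UINT64_C(1) << 2)	/* data cache enable */
#define SCTLR_I	(UINT64_C(1) << 12)	/* instruction cache enable */

/* Hypothetical helper: same effect as "bic x12, x12, x13". */
static uint64_t clear_flags(uint64_t sctlr, uint64_t mask)
{
	return sctlr & ~mask;	/* bits outside the mask are preserved */
}
```

Any bit that is not in the mask keeps its value, which is why the sequence reads the register first instead of writing a constant.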
As for shutting down the MMU and clearing the I+C bits, three cases should be
considered:
-1. Guest
"msr sctlr_el1, x12" is enough to turn off the EL1&0 translation regime and
clear the I+C bits.
-2. EL2&0 host
According to "D12.2.101 SCTLR_EL2, System Control Register (EL2)" in the
"ARM Architecture Reference Manual", a host running in the EL2&0 regime
(HCR_EL2.E2H == 1) actually accesses SCTLR_EL2 when it uses the mnemonic
SCTLR_EL1.
So "msr sctlr_el1, x12" is enough to turn off the MMU and clear the I+C bits.
-3. EL1&0 host
"msr sctlr_el1, x12" turns off the EL1&0 translation regime. As for the EL2
regime, el2_setup neither turns it on nor sets those bits, and KVM clears
them when it is unloaded or when it services an HVC_SOFT_RESTART call.
In conclusion, shutting down the MMU and clearing the I+C bits in
arm64_relocate_new_kernel is redundant.
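
To sanity-check the three cases, here is a small C model of the reasoning. It is only a sketch and not kernel code: struct model_cpu and the model_* helpers are hypothetical names, MODEL_FLAGS is limited to the M, C and I bits, and the HVC_SOFT_RESTART step is assumed to clear the same bits in SCTLR_EL2 (which is what the KVM handler is described as doing; the hyp-stub case relies on el2_setup never having set them).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the kernel's SCTLR_ELx_FLAGS covers more. */
#define SCTLR_M	(UINT64_C(1) << 0)	/* MMU enable */
#define SCTLR_C	(UINT64_C(1) << 2)	/* data cache enable */
#define SCTLR_I	(UINT64_C(1) << 12)	/* instruction cache enable */
#define MODEL_FLAGS	(SCTLR_M | SCTLR_C | SCTLR_I)

/* Hypothetical model of the state that matters here, not a kernel type. */
struct model_cpu {
	int current_el;		/* EL the kernel runs at: 1 or 2 */
	bool e2h;		/* HCR_EL2.E2H */
	uint64_t sctlr_el1;
	uint64_t sctlr_el2;
};

/* Register actually reached by the mnemonic SCTLR_EL1. */
static uint64_t *sctlr_el1_target(struct model_cpu *cpu)
{
	if (cpu->current_el == 2 && cpu->e2h)
		return &cpu->sctlr_el2;	/* case 2: E2H redirection */
	return &cpu->sctlr_el1;		/* cases 1 and 3 */
}

/* Models "msr sctlr_el1, x12" plus the optional HVC_SOFT_RESTART. */
static void model_soft_restart(struct model_cpu *cpu, bool el2_switch)
{
	*sctlr_el1_target(cpu) &= ~MODEL_FLAGS;
	if (el2_switch)
		cpu->sctlr_el2 &= ~MODEL_FLAGS;	/* assumed handler behaviour */
}

/* True if neither modelled register still has M, C or I set. */
static bool model_all_off(const struct model_cpu *cpu)
{
	return ((cpu->sctlr_el1 | cpu->sctlr_el2) & MODEL_FLAGS) == 0;
}
```

Starting from { .current_el = 2, .e2h = true, .sctlr_el2 = MODEL_FLAGS } (case 2) or from { .current_el = 1, .sctlr_el1 = MODEL_FLAGS, .sctlr_el2 = MODEL_FLAGS } with el2_switch set (case 3 with KVM loaded), model_all_off() holds after model_soft_restart(), so in this model nothing is left for arm64_relocate_new_kernel to clear.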
Thanks,
Pingfan
On Thu, Aug 6, 2020 at 8:20 PM James Morse [off-list ref] wrote:
> Hi Liu,
>
> On 06/08/2020 09:26, Pingfan Liu wrote:
> > The kexec switch sequence looks like the following:
> > SYM_CODE_START(__cpu_soft_restart)
> > 	...
> > 	pre_disable_mmu_workaround
> > 	msr	sctlr_el1, x12
> > 	...
> > 	br	x8
> >
> > SYM_CODE_START(arm64_relocate_new_kernel)
> > 	...
> > 	pre_disable_mmu_workaround
> > 	msr	sctlr_el2, x0
> > 	...
>
> > "msr sctlr_el2, x0" is misleading, because "br x8" jumps to a physical
> > address, which has no entry in idmap.
>
> Even better: this code runs from a copy allocated by kexec, it's not in
> the idmap either. See the memcpy() in machine_kexec().
>
> > It implies that MMU has already been fully off after "msr sctlr_el1, x12".
>
> > And according to "D12.2.101 SCTLR_EL2, System Control Register (EL2)" in
> > "ARM Architecture Reference Manual", actually, EL2&0 host accesses
> > to SCTLR_EL2 when using mnemonic SCTLR_EL1.
>
> Only when HCR_EL2.E2H is enabled.
> If linux booted at EL2 on a non-VHE system, SCTLR_EL1 and SCTLR_EL2 are
> different registers, both of which are managed by linux/KVM.
>
> > Hence removing the redundant but misleading code.
>
> This isn't the reason it's redundant...
>
> > diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> > index 4a18055..37721eb 100644
> > --- a/arch/arm64/kernel/cpu-reset.S
> > +++ b/arch/arm64/kernel/cpu-reset.S
> > @@ -35,6 +35,10 @@ SYM_CODE_START(__cpu_soft_restart)
> >  	mov_q	x13, SCTLR_ELx_FLAGS
> >  	bic	x12, x12, x13
> >  	pre_disable_mmu_workaround
> > +	/*
> > +	 * either disable EL1&0 translation regime or disable EL2&0 translation
> > +	 * regime if HCR_EL2.E2H == 1
> > +	 */
> >  	msr	sctlr_el1, x12
> >  	isb
>
> On a VHE system, yes the cpu-reset.S disables EL2&0 by writing to SCTLR_EL1.
> But on a non-VHE system, that same code disables EL1&0. cpu-reset.S goes on
> to call HVC_SOFT_RESTART for EL2, which may be serviced by KVM or the
> hyp-stub. (or maybe something else that implements the hyp-stub api)
>
> For kexec, on non-VHE both EL1&0 and EL2 get disabled.
>
> > diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> > index 542d6ed..84eec95 100644
> > --- a/arch/arm64/kernel/relocate_kernel.S
> > +++ b/arch/arm64/kernel/relocate_kernel.S
> > @@ -36,18 +36,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
> >  	mov	x14, xzr			/* x14 = entry ptr */
> >  	mov	x13, xzr			/* x13 = copy dest */
> > -	/* Clear the sctlr_el2 flags. */
> > -	mrs	x0, CurrentEL
> > -	cmp	x0, #CurrentEL_EL2
> > -	b.ne	1f
> > -	mrs	x0, sctlr_el2
> > -	mov_q	x1, SCTLR_ELx_FLAGS
> > -	bic	x0, x0, x1
> > -	pre_disable_mmu_workaround
> > -	msr	sctlr_el2, x0
> > -	isb
> > -1:
>
> I agree this doesn't disable the MMU anymore.
>
> This was originally kept to disable the I+C bits when Kdump interrupted
> KVM, but since KVM formalised the hyp-stub API, and has this exact sequence
> to back its HVC_SOFT_RESTART, it was only needed for the hyp-stub itself,
> which has no clue about these SCTLR_EL2 bits.
>
> HVC_SOFT_RESTART only says it needs to disable the MMU.
> See Documentation/virt/kvm/arm/hyp-abi.rst
>
> I think it's fine to remove this, but the reason is because el2_setup
> doesn't set those bits, and KVM clears them when it's unloaded, or has a
> HVC_SOFT_RESTART call.
>
> It might be worth updating the document, but we'd need to check the
> guarantee is the same on 32bit. I assume there is no out-of-tree user of
> the hyp-stub abi.
>
> I don't think the E2H register redirection has anything to do with this.
>
> Thanks,
>
> James
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel