Re: [PATCH] powerpc/pseries: Fix scv instruction crash with kexec
From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2024-07-09 13:04:01
Subsystem: linux for powerpc (32-bit and 64-bit), the rest
Maintainers: Madhavan Srinivasan, Michael Ellerman, Linus Torvalds
Michal Suchánek writes:
> Hello,
>
> On Tue, Jun 25, 2024 at 11:40:47PM +1000, Nicholas Piggin wrote:
> > kexec on pseries disables AIL (reloc_on_exc), required for scv
> > instruction support, before other CPUs have been shut down. This means
> > they can execute scv instructions after AIL is disabled, which causes an
> > interrupt at an unexpected entry location that crashes the kernel.
> >
> > Change the kexec sequence to disable AIL after other CPUs have been
> > brought down.
> >
> > As a refresher, the real-mode scv interrupt vector is 0x17000, and the
> > fixed-location head code probably couldn't easily deal with implementing
> > such high addresses so it was just decided not to support that interrupt
> > at all.
> >
> > Reported-by: Sourabh Jain <redacted>
> > Fixes: 7fa95f9adaee7 ("powerpc/64s: system call support for scv/rfscv instructions")
>
> looks like this is only broken by commit 2ab2d5794f14 ("powerpc/kasan:
> Disable address sanitization in kexec paths")
>
> This change reverts the kexec parts done in that commit.
>
> That is the fix is 5.19+, not 5.9+

Commit 2ab2d5794f14 moved the kexec code from one file to another, but didn't change when the key function (pseries_disable_reloc_on_exc()) was called. The old code was:
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index a3dab15b0a2f..c9fcc30a0365 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -421,16 +421,6 @@ void pseries_disable_reloc_on_exc(void)
 }
 EXPORT_SYMBOL(pseries_disable_reloc_on_exc);
 
-#ifdef CONFIG_KEXEC_CORE
-static void pSeries_machine_kexec(struct kimage *image)
-{
-	if (firmware_has_feature(FW_FEATURE_SET_MODE))
-		pseries_disable_reloc_on_exc();
-
-	default_machine_kexec(image);
-}
-#endif
-

ie. pseries_disable_reloc_on_exc() (which disables AIL) is called before
default_machine_kexec() where secondary CPUs are collected.

So AFAICS the bug would still have been there prior to 2ab2d5794f14.

But it's late here so I could be reading it wrong.

cheers