Re: [RFC PATCH 7/7] arm64/efi: Call EFI runtime services without disabling preemption
From: Peter Zijlstra <peterz@infradead.org>
Date: 2025-07-14 10:55:25
Also in: linux-efi, lkml
On Mon, Jul 14, 2025 at 12:20:30PM +1000, Ard Biesheuvel wrote:
> On Fri, 11 Jul 2025 at 23:48, Peter Zijlstra [off-list ref] wrote:
> > On Wed, May 14, 2025 at 07:43:47PM +0200, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > The only remaining reason why EFI runtime services are invoked
> > > with preemption disabled is the fact that the mm is swapped out
> > > behind the back of the context switching code.
> > >
> > > The kernel no longer disables preemption in kernel_neon_begin().
> > > Furthermore, the EFI spec is being clarified to explicitly state
> > > that only baseline FP/SIMD is permitted in EFI runtime service
> > > implementations, and so the existing kernel mode NEON context
> > > switching code is sufficient to preserve and restore the execution
> > > context of an in-progress EFI runtime service call.
> > >
> > > Most EFI calls are made from the efi_rts_wq, which is serviced by
> > > a kthread. As kthreads never return to user space, they usually
> > > don't have an mm, and so we can use the existing infrastructure to
> > > swap in the efi_mm while the EFI call is in progress. This is
> > > visible to the scheduler, which will therefore reactivate the
> > > selected mm when switching out the kthread and back in again.
> > >
> > > Given that the EFI spec explicitly permits runtime services to be
> > > called with interrupts enabled, firmware code is already required
> > > to tolerate interruptions. So rather than disable preemption,
> > > disable only migration so that EFI runtime services are less
> > > likely to cause scheduling delays.
> > >
> > > Note, though, that the firmware executes at the same privilege
> > > level as the kernel, and is therefore able to disable interrupts
> > > altogether.
> >
> > Is the migrate_disable() strictly required, or just paranoia?
>
> Runtime services might be polling the secure firmware for an async
> completion when they are interrupted, and so I don't think it is
> generally safe to assume that an interrupted EFI runtime service can
> be resumed on another CPU.

Can we please get a comment with that migrate_disable() explaining this? Thanks!
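
Something along these lines, perhaps. This is only a rough userspace sketch of where such a comment could sit, not the actual patch: migrate_disable() and migrate_enable() are stubbed out here as simple nesting counters, and fake_efi_runtime_call() and efi_call_sketch() are hypothetical names made up for illustration. The comment wording just restates the explanation quoted above.

```c
/*
 * Hypothetical userspace stand-ins for the kernel's migrate_disable() and
 * migrate_enable(). They only track nesting depth so the sketch can run
 * outside the kernel.
 */
static int migrate_disable_depth;

static void migrate_disable(void) { migrate_disable_depth++; }
static void migrate_enable(void)  { migrate_disable_depth--; }

/*
 * Hypothetical placeholder for an EFI runtime service call. It returns 1
 * if migration was disabled while it ran, 0 otherwise.
 */
static int fake_efi_runtime_call(void)
{
	return migrate_disable_depth > 0;
}

/* Hypothetical wrapper showing where the requested comment could go. */
static int efi_call_sketch(void)
{
	int ret;

	/*
	 * EFI runtime services might be polling the secure firmware for
	 * an async completion when they are interrupted, so it is not
	 * generally safe to assume that an interrupted call can be
	 * resumed on another CPU. Keep the task on this CPU for the
	 * duration of the call, but leave preemption enabled.
	 */
	migrate_disable();
	ret = fake_efi_runtime_call();
	migrate_enable();

	return ret;
}
```

The point is just that the reason for pinning the task (firmware state that may be tied to the current CPU) is spelled out next to the call, so nobody later mistakes it for paranoia and removes it.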