Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date: 2026-03-25 22:29:46
Also in: bpf, kvm, linux-doc, lkml
On Wed, Mar 25, 2026 at 07:41:46PM +0000, David Laight wrote:
> On Wed, 25 Mar 2026 10:50:58 -0700 Jim Mattson [off-list ref] wrote:
>
> > On Tue, Mar 24, 2026 at 11:19 AM Pawan Gupta [off-list ref] wrote:
> > >
> > > As a mitigation for BHI, clear_bhb_loop() executes branches that
> > > overwrites the Branch History Buffer (BHB). On Alder Lake and newer
> > > parts this sequence is not sufficient because it doesn't clear enough
> > > entries. This was not an issue because these CPUs have a hardware
> > > control (BHI_DIS_S) that mitigates BHI in kernel.
> > >
> > > BHI variant of VMSCAPE requires isolating branch history between
> > > guests and userspace. Note that there is no equivalent hardware
> > > control for userspace. To effectively isolate branch history on newer
> > > CPUs, clear_bhb_loop() should execute sufficient number of branches to
> > > clear a larger BHB.
> > >
> > > Dynamically set the loop count of clear_bhb_loop() such that it is
> > > effective on newer CPUs too. Use the hardware control enumeration
> > > X86_FEATURE_BHI_CTRL to select the appropriate loop count.
> > >
> > > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > Reviewed-by: Nikolay Borisov <redacted>
> > > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > > ---
> > >  arch/x86/entry/entry_64.S   | 21 ++++++++++++++++-----
> > >  arch/x86/net/bpf_jit_comp.c |  7 -------
> > >  2 files changed, 16 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > > index 3a180a36ca0e..8128e00ca73f 100644
> > > --- a/arch/x86/entry/entry_64.S
> > > +++ b/arch/x86/entry/entry_64.S
> > > @@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> > >  SYM_FUNC_START(clear_bhb_loop)
> > >  	ANNOTATE_NOENDBR
> > >  	push	%rbp
> > > +	/* BPF caller may require %rax to be preserved */
>
> Since you need a new version change that to 'all registers preserved'.

Ya, that's more accurate.

> > > +	push	%rax
> >
> > Shouldn't the "push %rax" come after "mov %rsp, %rbp"?
>
> Or delete the stack frame :-)
> It is only there for the stack trace-back code.

Hmm, let's keep the stack frame; it might help with debugging.
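
For anyone following along, the loop-count selection described in the quoted commit message could be modelled in plain C roughly as below. This is only a sketch, not the patch itself: the real code is assembly in arch/x86/entry/entry_64.S, the helper names (bhb_select_counts(), bhb_inner_iterations()) are hypothetical, has_bhi_ctrl merely stands in for a check of X86_FEATURE_BHI_CTRL, and the counts are illustrative placeholders that are not taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder loop counts, assumed for illustration only. The values
 * actually used by clear_bhb_loop() are defined by the patch, not here.
 */
enum {
	BHB_SHORT_OUTER = 5,
	BHB_SHORT_INNER = 5,
	BHB_LONG_OUTER  = 12,
	BHB_LONG_INNER  = 7,
};

struct bhb_loop_counts {
	unsigned int outer;
	unsigned int inner;
};

/*
 * Hypothetical helper: pick the longer sequence when the CPU enumerates
 * the BHI hardware control, which the commit message uses as a proxy for
 * parts with a larger BHB. has_bhi_ctrl models X86_FEATURE_BHI_CTRL.
 */
struct bhb_loop_counts bhb_select_counts(bool has_bhi_ctrl)
{
	struct bhb_loop_counts c;

	if (has_bhi_ctrl) {
		c.outer = BHB_LONG_OUTER;
		c.inner = BHB_LONG_INNER;
	} else {
		c.outer = BHB_SHORT_OUTER;
		c.inner = BHB_SHORT_INNER;
	}
	return c;
}

/* Total inner-loop iterations a nested loop with these counts would run. */
unsigned int bhb_inner_iterations(struct bhb_loop_counts c)
{
	assert(c.outer > 0 && c.inner > 0);
	return c.outer * c.inner;
}
```

With these placeholder values, bhb_inner_iterations(bhb_select_counts(true)) is larger than the same call with false, which is the only property the model is meant to show: parts that enumerate the hardware control get the longer sequence.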