Re: nop5-optimized USDTs WAS: Re: [PATCHv6 perf/core 09/22] uprobes/x86: Add uprobe syscall to speed up uprobe
From: Peter Zijlstra <peterz@infradead.org>
Date: 2025-09-04 20:52:31
Also in:
bpf, lkml
On Thu, Sep 04, 2025 at 01:49:49PM -0700, Andrii Nakryiko wrote:
> On Thu, Sep 4, 2025 at 1:35 PM Peter Zijlstra [off-list ref] wrote:
> > On Thu, Sep 04, 2025 at 11:27:45AM -0700, Andrii Nakryiko wrote:
> > > > > So I've been thinking what's the simplest and most reliable way to feature-detect support for this sys_uprobe (e.g., for libbpf to know whether we should attach at nop5 vs nop1), and clearly that would be
> > > >
> > > > wrt nop5/nop1.. so the idea is to have USDT macro emit both nop1,nop5 and store some info about that in the usdt's elf note, right?
> >
> > Wait, what? You're doing to emit 6 bytes and two nops? Why? Surely the old kernel can INT3 on top of a NOP5?
>
> Yes it can, but it's 2x slower in terms of uprobe triggering compared to nop1.

Why? That doesn't really make sense. I realize it's probably too late to fix the old kernel not to be stupid -- this must be something stupid, right? But now I need to know.