Re: [PATCHv6 perf/core 09/22] uprobes/x86: Add uprobe syscall to speed up uprobe
From: Andrii Nakryiko <hidden>
Date: 2025-09-05 18:44:22
Also in: bpf, lkml
On Fri, Sep 5, 2025 at 3:46 AM Jiri Olsa [off-list ref] wrote:
> On Thu, Sep 04, 2025 at 11:32:06AM -0700, Andrii Nakryiko wrote:
> > On Thu, Sep 4, 2025 at 7:03 AM Jiri Olsa [off-list ref] wrote:
> > > On Thu, Sep 04, 2025 at 11:39:33AM +0200, Jann Horn wrote:
> > > > On Thu, Sep 4, 2025 at 9:56 AM Jiri Olsa [off-list ref] wrote:
> > > > > On Wed, Sep 03, 2025 at 04:12:37PM -0700, Andrii Nakryiko wrote:
> > > > > > On Wed, Sep 3, 2025 at 2:01 PM Peter Zijlstra [off-list ref] wrote:
> > > > > > > On Wed, Sep 03, 2025 at 10:56:10PM +0200, Jiri Olsa wrote:
> > > > > > > > > > +SYSCALL_DEFINE0(uprobe)
> > > > > > > > > > +{
> > > > > > > > > > +	struct pt_regs *regs = task_pt_regs(current);
> > > > > > > > > > +	struct uprobe_syscall_args args;
> > > > > > > > > > +	unsigned long ip, sp;
> > > > > > > > > > +	int err;
> > > > > > > > > > +
> > > > > > > > > > +	/* Allow execution only from uprobe trampolines. */
> > > > > > > > > > +	if (!in_uprobe_trampoline(regs->ip))
> > > > > > > > > > +		goto sigill;
> > > > > > > > >
> > > > > > > > > Hey Jiri,
> > > > > > > > >
> > > > > > > > > So I've been thinking what's the simplest and most
> > > > > > > > > reliable way to feature-detect support for this
> > > > > > > > > sys_uprobe (e.g., for libbpf to know whether we should
> > > > > > > > > attach at nop5 vs nop1), and clearly that would be to
> > > > > > > > > try to call uprobe() syscall not from trampoline, and
> > > > > > > > > expect some error code.
> > > > > > > > >
> > > > > > > > > How bad would it be to change this part to return some
> > > > > > > > > unique-enough error code (-ENXIO, -EDOM, whatever).
> > > > > > > > >
> > > > > > > > > Is there any reason not to do this? Security-wise it
> > > > > > > > > will be just fine, right?
> > > > > > > >
> > > > > > > > good question.. maybe :) the sys_uprobe sigill error path
> > > > > > > > followed the uprobe logic when things go bad, seem like
> > > > > > > > good idea to be strict
> > > > > > > >
> > > > > > > > I understand it'd make the detection code simpler, but it
> > > > > > > > could just just fork and check for sigill, right?
> > > > > > >
> > > > > > > Can't you simply uprobe your own nop5 and read back the text
> > > > > > > to see what it turns into?
> > > > > >
> > > > > > Sure, but none of that is neither fast, nor cheap, nor that
> > > > > > simple... (and requires elevated permissions just to detect)
> > > > > >
> > > > > > Forking is also resource-intensive. (think from libbpf's
> > > > > > perspective, it's not cool for library to fork some
> > > > > > application just to check such a seemingly simple thing as
> > > > > > whether to
> > > > > >
> > > > > > The question is why all that? That SIGILL when
> > > > > > !in_uprobe_trampoline() is just paranoid. I understand killing
> > > > > > an application if it tries to screw up "protocol" in all the
> > > > > > subsequent checks. But here it's equally secure to just fail
> > > > > > that syscall with normal error, instead of punishing by death.
> > > > >
> > > > > adding Jann to the loop, any thoughts on this ^^^ ?
> > > >
> > > > If I understand correctly, the main reason for the SIGILL is that
> > > > if you hit an error in here when coming from an actual uprobe, and
> > > > if the syscall were to just return an error, then you'd end up not
> > > > restoring registers as expected which would probably end up
> > > > crashing the process in a pretty ugly way?
> > >
> > > for some cases yes, for the initial checks I think we could just
> > > skip the uprobe and process would continue just fine
> >
> > For non-buggy kernel implementation in_uprobe_trampoline(regs->ip)
> > will (should) always be true when triggered for kernel-installed
> > uprobe. So this check can fail only for cases when someone
> > intentionally called sys_uprobe not from kernel-generated and
> > kernel-controlled trampoline.
> >
> > At which point it's totally fine to just return an error and do
> > nothing.
> >
> > > we use sigill because the trap code paths use it for errors and to
> > > be paranoid about the !in_uprobe_trampoline check
> >
> > Yeah, and it should be totally fine to keep doing that.
> >
> > It's just about that entry in_uprobe_trampoline() check. And that's
> > sufficient to make all this nicely integrated with USDT use cases.
> >
> > (I'd say it would be nice to also amend this into original patch to
> > avoid someone cherry picking original commit and forgetting/missing
> > the follow up change. But that's up to Peter.)
> >
> > Jiri, can you please send a quick patch and see how that goes? Thanks!
>
> seems like it's as easy as the change below, I'll send formal patches
> later if I don't hear otherwise.. we will also need man page change
>
> jirka
>
> ---
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 0a8c0a4a5423..845aeaf36b8d 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -810,7 +810,7 @@ SYSCALL_DEFINE0(uprobe)
>
>  	/* Allow execution only from uprobe trampolines. */
>  	if (!in_uprobe_trampoline(regs->ip))
> -		goto sigill;
> +		return -ENXIO;

thanks!

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>
>  	err = copy_from_user(&args, (void __user *)regs->sp, sizeof(args));
>  	if (err)
> diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> index 5da0b49eeaca..6d75ede16e7c 100644
> --- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> @@ -757,34 +757,12 @@ static void test_uprobe_race(void)
>  #define __NR_uprobe 336
>  #endif
>
> -static void test_uprobe_sigill(void)
> +static void test_uprobe_error(void)
>  {
> -	int status, err, pid;
> +	long err = syscall(__NR_uprobe);
>
> -	pid = fork();
> -	if (!ASSERT_GE(pid, 0, "fork"))
> -		return;
> -	/* child */
> -	if (pid == 0) {
> -		asm volatile (
> -			"pushq %rax\n"
> -			"pushq %rcx\n"
> -			"pushq %r11\n"
> -			"movq $" __stringify(__NR_uprobe) ", %rax\n"
> -			"syscall\n"
> -			"popq %r11\n"
> -			"popq %rcx\n"
> -			"retq\n"
> -		);
> -		exit(0);
> -	}
> -
> -	err = waitpid(pid, &status, 0);
> -	ASSERT_EQ(err, pid, "waitpid");
> -
> -	/* verify the child got killed with SIGILL */
> -	ASSERT_EQ(WIFSIGNALED(status), 1, "WIFSIGNALED");
> -	ASSERT_EQ(WTERMSIG(status), SIGILL, "WTERMSIG");
> +	ASSERT_EQ(err, -1, "error");
> +	ASSERT_EQ(errno, ENXIO, "errno");
>  }
>
>  static void __test_uprobe_syscall(void)
> @@ -805,8 +783,8 @@ static void __test_uprobe_syscall(void)
>  		test_uprobe_usdt();
>  	if (test__start_subtest("uprobe_race"))
>  		test_uprobe_race();
> -	if (test__start_subtest("uprobe_sigill"))
> -		test_uprobe_sigill();
> +	if (test__start_subtest("uprobe_error"))
> +		test_uprobe_error();
>  	if (test__start_subtest("uprobe_regs_equal"))
>  		test_uprobe_regs_equal(false);
>  	if (test__start_subtest("regs_change"))
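
For illustration, the userspace feature detection discussed in this thread could look roughly like the sketch below. It is only a sketch under stated assumptions: the syscall number 336 is the fallback used by the selftest and applies to x86-64 only, the helper names and the enum are hypothetical (not libbpf API), and treating ENOSYS as "not supported" is an assumption about kernels without sys_uprobe. Note that a kernel carrying the original patch without the -ENXIO change would deliver SIGILL to the caller instead of returning.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Fallback number taken from the selftest in this thread (x86-64 only). */
#ifndef __NR_uprobe
#define __NR_uprobe 336
#endif

/* Hypothetical result type, not part of any existing library. */
enum uprobe_syscall_support {
	UPROBE_SYSCALL_UNKNOWN = -1,	/* unexpected result, e.g. filtered by seccomp */
	UPROBE_SYSCALL_ABSENT = 0,	/* assumption: ENOSYS means no sys_uprobe */
	UPROBE_SYSCALL_PRESENT = 1,	/* -ENXIO from the !in_uprobe_trampoline() check */
};

/*
 * Hypothetical helper: interpret the return value and errno of a
 * uprobe() call made from outside a kernel-installed trampoline.
 */
enum uprobe_syscall_support classify_uprobe_result(long ret, int err)
{
	if (ret == -1 && err == ENXIO)
		return UPROBE_SYSCALL_PRESENT;
	if (ret == -1 && err == ENOSYS)
		return UPROBE_SYSCALL_ABSENT;
	return UPROBE_SYSCALL_UNKNOWN;
}

/* Hypothetical helper: issue the probe call and classify the outcome. */
enum uprobe_syscall_support probe_uprobe_syscall(void)
{
#if defined(__x86_64__)
	long ret = syscall(__NR_uprobe);

	return classify_uprobe_result(ret, errno);
#else
	/* sys_uprobe as discussed here is x86-64 specific. */
	return UPROBE_SYSCALL_ABSENT;
#endif
}
```

A caller would invoke probe_uprobe_syscall() once and cache the result, picking nop5 only on UPROBE_SYSCALL_PRESENT and falling back to nop1 otherwise. No fork and no elevated permissions are needed, which is the point of returning an error instead of SIGILL.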