Re: [PATCH v7 00/10] fprobe: Introduce fprobe function entry/exit probe
From: Masami Hiramatsu <mhiramat@kernel.org>
Date: 2022-02-02 07:29:33
Also in:
bpf, lkml
Hi Jiri, On Wed, 2 Feb 2022 01:02:43 +0100 Jiri Olsa [off-list ref] wrote:
On Mon, Jan 31, 2022 at 02:00:24PM +0900, Masami Hiramatsu wrote:quoted
Hi, Here is the 7th version of fprobe. This version fixes unregister_fprobe() ensures that exit_handler is not called after returning from the unregister_fprobe(), and fixes some comments and documents. The previous version is here[1]; [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u (local) This series introduces the fprobe, the function entry/exit probe with multiple probe point support. This also introduces the rethook for hooking function return as same as the kretprobe does. This abstraction will help us to generalize the fgraph tracer, because we can just switch to it from the rethook in fprobe, depending on the kernel configuration. The patch [1/10] is from Jiri's series[2]. [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u (local) And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case if user wants to share the same code (or share a same resource) on the fprobe and the kprobes.hi, it works fine for bpf selftests, but when I use it through bpftrace to attach more probes with: # ./src/bpftrace -e 'kprobe:ksys_* { }' Attaching 27 probes I'm getting stalls like: krava33 login: [ 988.574069] INFO: task bpftrace:4137 blocked for more than 122 seconds. [ 988.577577] Not tainted 5.16.0+ #89 [ 988.580173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 988.585538] task:bpftrace state:D stack: 0 pid: 4137 ppid: 4123 flags:0x00004004 [ 988.589869] Call Trace: [ 988.591312] <TASK> [ 988.592577] __schedule+0x3a8/0xd30 [ 988.594469] ? wait_for_completion+0x84/0x110 [ 988.596753] schedule+0x4e/0xc0 [ 988.598480] schedule_timeout+0xed/0x130 [ 988.600524] ? rcu_read_lock_sched_held+0x12/0x70 [ 988.602901] ? lock_release+0x253/0x4a0 [ 988.604935] ? lock_acquired+0x1b7/0x410 [ 988.607041] ? trace_hardirqs_on+0x1b/0xe0 [ 988.609202] wait_for_completion+0xae/0x110 [ 988.613762] __wait_rcu_gp+0x127/0x130 [ 988.615787] synchronize_rcu_tasks_generic+0x46/0xa0 [ 988.618329] ? call_rcu_tasks+0x20/0x20 [ 988.620600] ? rcu_tasks_pregp_step+0x10/0x10 [ 988.623232] ftrace_shutdown.part.0+0x174/0x210 [ 988.625820] unregister_ftrace_function+0x37/0x60 [ 988.628480] unregister_fprobe+0x2d/0x50 [ 988.630928] bpf_link_free+0x4e/0x70 [ 988.633126] bpf_link_release+0x11/0x20 [ 988.635249] __fput+0xae/0x270 [ 988.637022] task_work_run+0x5c/0xa0 [ 988.639016] exit_to_user_mode_prepare+0x251/0x260 [ 988.641294] syscall_exit_to_user_mode+0x16/0x50 [ 988.646249] do_syscall_64+0x48/0x90 [ 988.648218] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 988.650787] RIP: 0033:0x7f9079e95fbb [ 988.652761] RSP: 002b:00007ffd474fa3b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 [ 988.656718] RAX: 0000000000000000 RBX: 00000000011bf8d0 RCX: 00007f9079e95fbb [ 988.660110] RDX: 0000000000000000 RSI: 00007ffd474fa3b0 RDI: 0000000000000019 [ 988.663512] RBP: 00007ffd474faaf0 R08: 0000000000000000 R09: 000000000000001a [ 988.666673] R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000001 [ 988.669770] R13: 00000000004a19a1 R14: 00007f9083428c00 R15: 00000000008c02d8 [ 988.672601] </TASK> [ 988.675763] INFO: lockdep is turned off. I have't investigated yet, any idea?
Hmm, no, as far as I tested with my example module, it works well as below; # insmod fprobe_example.ko symbol='ksys_*' && ls && sleep 1 && rmmod fprobe_example.ko [ 125.820113] fprobe_init: Planted fprobe at ksys_* [ 125.823153] sample_entry_handler: Enter <ksys_write+0x0/0xf0> ip = 0x000000008d8da91f [ 125.824247] fprobe_handler.part.0+0xb1/0x150 [ 125.825024] fprobe_handler+0x1e/0x20 [ 125.825799] 0xffffffffa000e0e3 [ 125.826540] ksys_write+0x5/0xf0 [ 125.827344] do_syscall_64+0x3b/0x90 [ 125.828144] entry_SYSCALL_64_after_hwframe+0x44/0xae fprobe_example.ko [ 125.829178] sample_exit_handler: Return from <ksys_write+0x0/0xf0> ip = 0x000000008d8da91f to rip = 0x00000000be5e197e (__x64_sys_write+0x1a/0x20) [ 125.830707] fprobe_exit_handler+0x29/0x30 [ 125.831415] rethook_trampoline_handler+0x99/0x140 [ 125.832259] arch_rethook_trampoline_callback+0x3f/0x50 [ 125.833110] arch_rethook_trampoline+0x2f/0x50 [ 125.833803] __x64_sys_write+0x1a/0x20 [ 125.834448] do_syscall_64+0x3b/0x90 [ 125.835055] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 126.878825] fprobe_exit: fprobe at ksys_* unregistered # Even with NR_CPUS=3, it didn't cause the stall. But maybe you'd better test with Paul's fix as Andrii pointed. Thank you,
thanks, jirka
-- Masami Hiramatsu [off-list ref]