Re: BPF tracing trampoline synchronization between update/freeing and execution?
From: Alexei Starovoitov <hidden>
Date: 2020-01-06 22:29:16
Also in: bpf, lkml
On Mon, Jan 06, 2020 at 05:56:54PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 06, 2020 at 05:39:30PM +0100, Jann Horn wrote:
> > Hi!
> >
> > I was chatting with kpsingh about BPF trampolines, and I noticed that
> > it looks like BPF trampolines (as of current bpf-next/master) seem to
> > be missing synchronization between trampoline code updates and
> > trampoline execution. Or maybe I'm missing something?
> >
> > If I understand correctly, trampolines are executed directly from the
> > fentry placeholders at the start of arbitrary kernel functions, so
> > they can run without any locks held. So for example, if task A starts
> > executing a trampoline on entry to sys_open(), then gets preempted in
> > the middle of the trampoline, and then task B quickly calls
> > BPF_RAW_TRACEPOINT_OPEN twice, and then task A continues execution,
> > task A will end up executing the middle of newly-written machine
> > code, which can probably end up crashing the kernel somehow?
> >
> > I think that at least to synchronize trampoline text freeing with
> > concurrent trampoline execution, it is necessary to do something
> > similar to what the livepatching code does with klp_check_stack(),
> > and then either use a callback from the scheduler to periodically
> > re-check tasks that were in the trampoline or let the trampoline
> > tail-call into a cleanup helper that is part of normal kernel text.
> > And you'd probably have to gate BPF trampolines on
> > CONFIG_HAVE_RELIABLE_STACKTRACE.
>
> ftrace uses synchronize_rcu_tasks() to flip between trampolines iirc.

good catch and good suggestion. synchronize_rcu_tasks() is needed here too.
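
To make the ordering concrete, here is a rough userspace sketch of what "synchronize before freeing" could look like. It is not the actual kernel patch: struct tramp_image, detach_from_fentry(), free_image() and retire_trampoline() are hypothetical names invented for illustration, and synchronize_rcu_tasks_stub() is only a logging stand-in for the real synchronize_rcu_tasks(), which in the kernel blocks until every task has gone through a voluntary context switch (so a task preempted inside the old trampoline is still waited for).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Records the order of steps so the sequence can be inspected. */
static char call_log[64];

static void log_step(const char *s)
{
    strncat(call_log, s, sizeof(call_log) - strlen(call_log) - 1);
}

/* Stand-in only: the real kernel synchronize_rcu_tasks() waits for an
 * RCU-tasks grace period. This stub just logs that it was called. */
static void synchronize_rcu_tasks_stub(void)
{
    log_step("sync;");
}

/* Hypothetical container for one generation of trampoline text. */
struct tramp_image {
    void *text;
};

/* Hypothetical: repoint the fentry call site away from the old image,
 * so no new task can enter it. */
static void detach_from_fentry(struct tramp_image *im)
{
    (void)im;
    log_step("detach;");
}

/* Hypothetical: release the executable memory of the old image. */
static void free_image(struct tramp_image *im)
{
    assert(im->text != NULL);
    free(im->text);
    im->text = NULL;
    log_step("free;");
}

/* Ordering under discussion: unpublish, wait for tasks that may still
 * be inside the old text, and only then free or reuse it. */
static void retire_trampoline(struct tramp_image *old)
{
    detach_from_fentry(old);
    synchronize_rcu_tasks_stub();
    free_image(old);
}

/* Runs one retirement and returns the recorded step order. */
const char *retire_demo(void)
{
    struct tramp_image im = { .text = malloc(16) };

    call_log[0] = '\0';
    retire_trampoline(&im);
    return call_log;
}
```

The point of the sketch is only the order of the three steps. Rewriting the same image in place, as in the BPF_RAW_TRACEPOINT_OPEN-twice scenario above, would presumably need the same wait before the old bytes are overwritten.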