[LSF/MM/BPF TOPIC] faster uprobes
From: Jiri Olsa <hidden>
Date: 2024-02-29 14:39:29
One of the uprobe pain points is slow execution: hitting a uprobe
takes two traps in the worst case, or a single trap if the original
instruction can be emulated. Return uprobes add one extra trap on
top of that.
My current idea on how to make this faster is to follow the optimized
kprobes and replace the normal uprobe trap instruction with a jump to
a user space trampoline that:
- executes a syscall to call the uprobe consumers' callbacks
- executes the original instructions
- jumps back to continue with the original code
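As a rough illustration of the patching arithmetic only, assuming
x86-64 and a 5-byte `jmp rel32` (opcode 0xE9) like the one optimized
kprobes use: the helper name below is hypothetical and this is not
existing kernel code, just a sketch of how the jump bytes and the
reachability limit fall out.

```python
import struct

JMP_REL32_LEN = 5  # assumption: opcode 0xE9 + 32-bit signed displacement


def encode_jmp_rel32(probe_addr, trampoline_addr):
    """Return the 5 bytes of `jmp rel32` from probe_addr to
    trampoline_addr, or None if the trampoline is out of range.

    Hypothetical helper, for illustration only.
    """
    # The displacement is relative to the address of the *next*
    # instruction, i.e. the byte right after the 5-byte jump.
    disp = trampoline_addr - (probe_addr + JMP_REL32_LEN)
    # rel32 is a signed 32-bit value, so the trampoline must live
    # within roughly +/-2 GiB of the probed instruction.
    if not -(1 << 31) <= disp < (1 << 31):
        return None
    return b"\xe9" + struct.pack("<i", disp)
```

The range check means the trampoline has to be mapped within about
2 GiB of the probed code, otherwise the plain trap would still be
needed.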
There are of course corner cases where the above will have trouble or
won't work at all, like:
- executing the original instructions in the trampoline is tricky wrt
RIP-relative addressing
- some instructions we can't move to the trampoline at all
- the uprobe address is right before a page boundary, so the jump
instruction to the trampoline would span 2 pages, hence the page
replacement won't be atomic, which might cause issues
- ... ? many others I'm sure
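Two of the corner cases above reduce to simple address arithmetic.
The sketch below assumes 4 KiB pages, a 5-byte jump, and x86-64
RIP-relative operands with a signed 32-bit displacement; both
function names are hypothetical and only illustrate the checks, not
how they would actually be implemented.

```python
PAGE_SIZE = 4096    # assumption: 4 KiB pages
JMP_REL32_LEN = 5   # assumption: 5-byte jmp rel32 replaces the trap


def jump_spans_pages(probe_addr, length=JMP_REL32_LEN):
    """True if the bytes [probe_addr, probe_addr + length) touch
    more than one page."""
    first_page = probe_addr // PAGE_SIZE
    last_page = (probe_addr + length - 1) // PAGE_SIZE
    return first_page != last_page


def relocate_rip_disp(disp, insn_addr, new_insn_addr):
    """Displacement a RIP-relative instruction needs after being
    copied from insn_addr to new_insn_addr so that it still
    references the same target, or None if it no longer fits.
    """
    # target = insn_addr + insn_len + disp. The instruction length
    # does not change when copying, so it cancels out here.
    new_disp = disp + (insn_addr - new_insn_addr)
    if not -(1 << 31) <= new_disp < (1 << 31):
        return None
    return new_disp
```

With these assumptions, only a probe in the last 4 bytes of a page
makes the jump cross into the next page, and a RIP-relative
instruction can be moved only if the trampoline stays within the
32-bit displacement range of the original target.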
Still, even with all these limitations, I think we could speed up at
least some portion of the uprobes, which seems worth doing.
I'd like to discuss the topic and get some agreement or direction on
how this should be done.