Re: [RFC PATCH 5/8] uprobe/x86: Add support to optimize on top of emulated instructions
From: Jiri Olsa <hidden>
Date: 2025-11-26 07:54:22
Also in:
bpf, lkml
On Mon, Nov 24, 2025 at 07:01:14PM +0100, Oleg Nesterov wrote:
Hi Jiri, I am trying to understand this series, will try to read it more carefully later... (damn why do you always send the patches when I am on PTO? ;)
it's more fun that way ;-) thanks for checking on it
On 11/17, Jiri Olsa wrote:quoted
struct arch_uprobe { union { - u8 insn[MAX_UINSN_BYTES]; + u8 insn[5*MAX_UINSN_BYTES];Hmm. OK, this matches the "for (i = 0; i < 5; i++)" loop in opt_setup_xol_ops(), but do we really need this change? Please see the question at the end.quoted
+static int opt_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) +{ + unsigned long offset = insn->length; + struct insn insnX; + int i, ret; + + if (test_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags)) + return -ENOSYS;I think this logic needs some cleanups... If ARCH_UPROBE_FLAG_CAN_OPTIMIZE is set by the caller, the it doesn't make sense to call xxx_setup_xol_ops(), right? But lets forget it for now.quoted
+ ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[0], insn);I think this should go into the main loop, see belowquoted
+ for (i = 1; i < 5; i++) { + ret = uprobe_init_insn_offset(auprobe, offset, &insnX, true); + if (ret) + break; + ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[i], &insnX); + if (ret) + break; + offset += insnX.length; + auprobe->opt.cnt++; + if (offset >= 5) + goto optimize; + } + + return -ENOSYS;I don't think -ENOSYS makes sense if opt_setup_xol_insns() succeeds at least once. IOW, how about static int opt_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) { unsigned long offset = 0; struct insn insnX; int i, ret; if (test_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags)) return -ENOSYS; for (i = 0; i < 5; i++) { ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[i], insn); if (ret) break; offset += insn->length; if (offset >= 5) break; insn = &insnX; ret = uprobe_init_insn_offset(auprobe, offset, insn, true); if (ret) break; } if (!offset) return -ENOSYS; if (offset >= 5) { auprobe->opt.cnt = i + 1; auprobe->xol.ops = &opt_xol_ops; set_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags); set_bit(ARCH_UPROBE_FLAG_OPTIMIZE_EMULATE, &auprobe->flags); } return 0; } ? This way the caller, arch_uprobe_analyze_insn(), doesn't need to call push/mov/sub/_setup_xol_ops(), and the code looks a bit simpler to me.
ah nice, will try that
No?quoted
+ * TODO perhaps we could 'emulate' nop, so there would be no need for + * ARCH_UPROBE_FLAG_OPTIMIZE_EMULATE flag, because we would emulate + * allways.Agreed... and this connects to "this logic needs some cleanups" above. I guess we need nop_setup_xol_ops() extracted from branch_setup_xol_ops() but again, lets forget it for now.
ok, it will hopefully make the code simpler, will check on that
------------------------------------------------------------------------------- Now the main question. What if we avoid this change - u8 insn[MAX_UINSN_BYTES]; + u8 insn[5*MAX_UINSN_BYTES]; mentioned above, and change opt_setup_xol_ops() to just do - for (i = 0; i < 5; i++) + for (i = 0;; i++) ? The main loop stops when offset >= 5 anyway.
And. if auprobe->insn[offset:MAX_UINSN_BYTES] doesn't contain a full/valid insn at the start, then uprobe_init_insn_offset()->insn_decode() should fail? Most probably I missed something, but I can't understand this part.
no, I think you're right, I did not realize we fit under MAX_UINSN_BYTES anyway, call instruction needs only 5 bytes thanks, jirka