Thread (15 messages) 15 messages, 4 authors, 2025-12-08

Re: [RFC PATCH 5/8] uprobe/x86: Add support to optimize on top of emulated instructions

From: Jiri Olsa <hidden>
Date: 2025-11-26 07:54:22
Also in: bpf, lkml

On Mon, Nov 24, 2025 at 07:01:14PM +0100, Oleg Nesterov wrote:
Hi Jiri,

I am trying to understand this series, will try to read it more carefully
later...

(damn why do you always send the patches when I am on PTO? ;)
it's more fun that way ;-) thanks for checking on it
On 11/17, Jiri Olsa wrote:
quoted
 struct arch_uprobe {
 	union {
-		u8			insn[MAX_UINSN_BYTES];
+		u8			insn[5*MAX_UINSN_BYTES];
Hmm. OK, this matches the "for (i = 0; i < 5; i++)" loop in
opt_setup_xol_ops(), but do we really need this change? Please see
the question at the end.
quoted
+static int opt_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
+{
+	unsigned long offset = insn->length;
+	struct insn insnX;
+	int i, ret;
+
+	if (test_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags))
+		return -ENOSYS;
I think this logic needs some cleanups... If ARCH_UPROBE_FLAG_CAN_OPTIMIZE
is set by the caller, the it doesn't make sense to call xxx_setup_xol_ops(),
right? But lets forget it for now.
quoted
+	ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[0], insn);
I think this should go into the main loop, see below
quoted
+	for (i = 1; i < 5; i++) {
+		ret = uprobe_init_insn_offset(auprobe, offset, &insnX, true);
+		if (ret)
+			break;
+		ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[i], &insnX);
+		if (ret)
+			break;
+		offset += insnX.length;
+		auprobe->opt.cnt++;
+		if (offset >= 5)
+			goto optimize;
+	}
+
+	return -ENOSYS;
I don't think -ENOSYS makes sense if opt_setup_xol_insns() succeeds at least once.
IOW, how about

	static int opt_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
	{
		unsigned long offset = 0;
		struct insn insnX;
		int i, ret;

		if (test_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags))
			return -ENOSYS;

		for (i = 0; i < 5; i++) {
			ret = opt_setup_xol_insns(auprobe, &auprobe->opt.xol[i], insn);
			if (ret)
				break;
			offset += insn->length;
			if (offset >= 5)
				break;

			insn = &insnX;
			ret = uprobe_init_insn_offset(auprobe, offset, insn, true);
			if (ret)
				break;
		}

		if (!offset)
			return -ENOSYS;

		if (offset >= 5) {
			auprobe->opt.cnt = i + 1;
			auprobe->xol.ops = &opt_xol_ops;
			set_bit(ARCH_UPROBE_FLAG_CAN_OPTIMIZE, &auprobe->flags);
			set_bit(ARCH_UPROBE_FLAG_OPTIMIZE_EMULATE, &auprobe->flags);
		}

		return 0;
	}

?

This way the caller, arch_uprobe_analyze_insn(), doesn't need to call
push/mov/sub/_setup_xol_ops(), and the code looks a bit simpler to me.
ah nice, will try that
No?
quoted
+      * TODO perhaps we could 'emulate' nop, so there would be no need for
+      * ARCH_UPROBE_FLAG_OPTIMIZE_EMULATE flag, because we would emulate
+      * allways.
Agreed... and this connects to "this logic needs some cleanups" above.
I guess we need nop_setup_xol_ops() extracted from branch_setup_xol_ops()
but again, lets forget it for now.
ok, it will hopefully make the code simpler, will check on that
-------------------------------------------------------------------------------
Now the main question. What if we avoid this change

	-             u8                      insn[MAX_UINSN_BYTES];
	+             u8                      insn[5*MAX_UINSN_BYTES];

mentioned above, and change opt_setup_xol_ops() to just do

	-	for (i = 0; i < 5; i++)
	+	for (i = 0;; i++)

?

The main loop stops when offset >= 5 anyway.
And. if auprobe->insn[offset:MAX_UINSN_BYTES] doesn't contain a full/valid
insn at the start, then uprobe_init_insn_offset()->insn_decode() should fail?

Most probably I missed something, but I can't understand this part.
no, I think you're right, I did not realize we fit under MAX_UINSN_BYTES
anyway, call instruction needs only 5 bytes

thanks,
jirka
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help