Thread (67 messages) 67 messages, 9 authors, 2024-07-10

Re: [PATCH v2 00/12] uprobes: add batched register/unregister APIs and per-CPU RW semaphore

From: Peter Zijlstra <peterz@infradead.org>
Date: 2024-07-03 08:07:45
Also in: bpf, lkml

On Tue, Jul 02, 2024 at 09:47:41PM -0700, Andrii Nakryiko wrote:
quoted
As you noted, that percpu-rwsem write side is quite insane. And you're
creating this batch complexity to mitigate that.

Note that batch API is needed regardless of percpu RW semaphore or
not. As I mentioned, once uprobes_treelock is mitigated one way or the
other, the next one is uprobe->register_rwsem. For scalability, we
need to get rid of it and preferably not add any locking at all. So
tentatively I'd like to have lockless RCU-protected iteration over
uprobe->consumers list and call consumer->handler(). This means that
on uprobes_unregister we'd need synchronize_rcu (for whatever RCU
flavor we end up using), to ensure that we don't free uprobe_consumer
memory from under handle_swbp() while it is actually triggering
consumers.

So, without batched unregistration we'll be back to the same problem
I'm solving here: doing synchronize_rcu() for each attached uprobe one
by one is prohibitively slow. We went through this exercise with
ftrace/kprobes already and fixed it with batched APIs. Doing that for
uprobes seems unavoidable as well.
I'm not immediately seeing how you need that terrible refcount stuff for
the batching though. If all you need is group a few unregisters together
in order to share a sync_rcu() that seems way overkill.

You seem to have muddled the order of things, which makes the actual
reason for doing things utterly unclear.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help