Thread (4 messages) 4 messages, 2 authors, 2025-06-09

Re: [PATCH v2] tracing: Fix regression of filter waiting a long time on RCU synchronization

From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2025-06-09 16:09:36
Also in: lkml

On Mon, Jun 09, 2025 at 11:39:55AM -0400, Steven Rostedt wrote:
On Mon, 9 Jun 2025 05:21:19 -0700
"Paul E. McKenney" [off-list ref] wrote:
quoted
quoted
+static void free_filter_list(struct rcu_head *rhp)
+{
+	struct filter_head *filter_list = container_of(rhp, struct filter_head, rcu);
+	struct filter_list *filter_item, *tmp;
+
+	list_for_each_entry_safe(filter_item, tmp, &filter_list->list, list) {
+		__free_filter(filter_item->filter);
+		list_del(&filter_item->list);
+		kfree(filter_item);
+	}  
If the series of synchronize_rcu() calls was taking 99 seconds,
we might well be going through the above loop quite a few times, all
within a bh-disabled region of code.  For example, the calls from both
filter_free_subsystem_filters() and process_system_preds() look like
they can queue a very large number of entries.
The iteration is for each event in a subsystem.

Seeing what that is:

~# for sys in /sys/kernel/tracing/events/* ; do cnt=`ls -F $sys |grep '/$' | wc -l`; echo $cnt $sys; done | sort -n | tail
24 /sys/kernel/tracing/events/tcp
26 /sys/kernel/tracing/events/xen
28 /sys/kernel/tracing/events/sched
32 /sys/kernel/tracing/events/writeback
33 /sys/kernel/tracing/events/libata
34 /sys/kernel/tracing/events/irq_vectors
55 /sys/kernel/tracing/events/xhci-hcd
92 /sys/kernel/tracing/events/kvm
113 /sys/kernel/tracing/events/ext4
724 /sys/kernel/tracing/events/syscalls

So the most loops that can happen is 724 on the syscalls plus the system
filter. And adding in printks to count the loops, I verified that:

[..]
[  100.019209] LOOPED 4 times
[  100.020033] LOOPED 5 times
[  100.020791] LOOPED 8 times
[  100.021583] LOOPED 10 times
[  100.022239] LOOPED 2 times
[  100.110515] LOOPED 725 times
[  100.111520] LOOPED 4 times
[  100.112889] LOOPED 25 times
[  100.114416] LOOPED 6 times
[  100.115950] LOOPED 4 times
[  100.117557] LOOPED 7 times
[..]

(724 event filters and one system event filter = 725 filters).

If you feel that's too big, we could move it to a work queue.
That does sound like a bit more than would be good.
As more system calls get added, that number will grow. But it is limited to
the number of system calls.

All the event systems are way fewer than that.
quoted
So...
quoted
+	kfree(filter_list);
+}
+
+static void free_filter_list_tasks(struct rcu_head *rhp)
+{
+	call_rcu(rhp, free_filter_list);  
...why not use queue_rcu_work() in place of call_rcu() here, thus avoiding
possible response-time issues?

Or is there something that I am missing that limits the length of the
list passed to free_filter_list()?
Just the number of events in a system.

I could switch to a work queue but I don't think it needs to go as a fix,
do you? That is, it can wait till the next merge window. Or is this going
to affect things now?
Thank youi, and the next merge window should be fine.

							Thanx, Paul
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help