Thread (4 messages) 4 messages, 2 authors, 2025-06-09

Re: [PATCH v2] tracing: Fix regression of filter waiting a long time on RCU synchronization

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-06-09 15:38:36
Also in: lkml

On Mon, 9 Jun 2025 05:21:19 -0700
"Paul E. McKenney" [off-list ref] wrote:
quoted
+static void free_filter_list(struct rcu_head *rhp)
+{
+	struct filter_head *filter_list = container_of(rhp, struct filter_head, rcu);
+	struct filter_list *filter_item, *tmp;
+
+	list_for_each_entry_safe(filter_item, tmp, &filter_list->list, list) {
+		__free_filter(filter_item->filter);
+		list_del(&filter_item->list);
+		kfree(filter_item);
+	}  
If the series of synchronize_rcu() calls was taking 99 seconds,
we might well be going through the above loop quite a few times, all
within a bh-disabled region of code.  For example, the calls from both
filter_free_subsystem_filters() and process_system_preds() look like
they can queue a very large number of entries.
The iteration is for each event in a subsystem.

Seeing what that is:

~# for sys in /sys/kernel/tracing/events/* ; do cnt=`ls -F $sys |grep '/$' | wc -l`; echo $cnt $sys; done | sort -n | tail
24 /sys/kernel/tracing/events/tcp
26 /sys/kernel/tracing/events/xen
28 /sys/kernel/tracing/events/sched
32 /sys/kernel/tracing/events/writeback
33 /sys/kernel/tracing/events/libata
34 /sys/kernel/tracing/events/irq_vectors
55 /sys/kernel/tracing/events/xhci-hcd
92 /sys/kernel/tracing/events/kvm
113 /sys/kernel/tracing/events/ext4
724 /sys/kernel/tracing/events/syscalls

So the most loops that can happen is 724 on the syscalls plus the system
filter. And adding in printks to count the loops, I verified that:

[..]
[  100.019209] LOOPED 4 times
[  100.020033] LOOPED 5 times
[  100.020791] LOOPED 8 times
[  100.021583] LOOPED 10 times
[  100.022239] LOOPED 2 times
[  100.110515] LOOPED 725 times
[  100.111520] LOOPED 4 times
[  100.112889] LOOPED 25 times
[  100.114416] LOOPED 6 times
[  100.115950] LOOPED 4 times
[  100.117557] LOOPED 7 times
[..]

(724 event filters and one system event filter = 725 filters).

If you feel that's too big, we could move it to a work queue.

As more system calls get added, that number will grow. But it is limited to
the number of system calls.

All the event systems are way fewer than that.
So...
quoted
+	kfree(filter_list);
+}
+
+static void free_filter_list_tasks(struct rcu_head *rhp)
+{
+	call_rcu(rhp, free_filter_list);  
...why not use queue_rcu_work() in place of call_rcu() here, thus avoiding
possible response-time issues?

Or is there something that I am missing that limits the length of the
list passed to free_filter_list()?
Just the number of events in a system.

I could switch to a work queue but I don't think it needs to go as a fix,
do you? That is, it can wait till the next merge window. Or is this going
to affect things now?

-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help