Re: [PATCH v10 06/14] unwind_user/deferred: Add deferred unwinding interface
From: Peter Zijlstra <peterz@infradead.org>
Date: 2025-06-19 07:50:23
Also in:
bpf, lkml
On Wed, Jun 18, 2025 at 03:09:15PM -0400, Steven Rostedt wrote:
On Wed, 18 Jun 2025 20:46:20 +0200 Peter Zijlstra [off-list ref] wrote:quoted
quoted
+struct unwind_work; + +typedef void (*unwind_callback_t)(struct unwind_work *work, struct unwind_stacktrace *trace, u64 timestamp); + +struct unwind_work { + struct list_head list;Does this really need to be a list? Single linked list like callback_head not good enough?Doesn't a list head make it easier to remove without having to iterate the list?
Yeah, but why would you ever want to remove it? You asked for an unwind, you get an unwind, no?
quoted
quoted
static __always_inline void unwind_exit_to_user_mode(void) { if (unlikely(current->unwind_info.cache)) current->unwind_info.cache->nr_entries = 0; + current->unwind_info.timestamp = 0;Surely clearing that timestamp is only relevant when there is a cache around? Better to not add this unconditional write to the exit path.That's actually not quite true. If the allocation fails, we still want to clear the timestamp. But later patches add more data to check and it does exit out if there's been no requests:
Well, you could put in an error value on alloc fail or somesuch. Then its non-zero.
But for better reviewing, I could add a comment in this patch that states that this will eventually exit out early when it does more work.
You're making this really hard to review, why not do it right from the get-go?
quoted
quoted
+/* Guards adding to and reading the list of callbacks */ +static DEFINE_MUTEX(callback_mutex); +static LIST_HEAD(callbacks);Global state.. smells like failure.Yes, the unwind infrastructure is global, as it is the way tasks know what tracer's callbacks to call.
Well, that's apparently how you've set it up. I don't immediately see
this has to be like this.
And there's no comments no nothing.
I don't see why you can't have something like:
struct unwind_work {
struct callback_head task_work;
void *data;
void (*func)(struct unwind_work *work, void *data);
};
void unwind_task_work_func(struct callback_head *task_work)
{
struct unwind_work *uw = container_of(task_work, struct unwind_work, task_work);
// do actual unwind
uw->func(uw, uw->data);
}
or something along those lines. No global state involved.
quoted
quoted
+ guard(mutex)(&callback_mutex); + list_for_each_entry(work, &callbacks, list) { + work->func(work, &trace, timestamp); + }So now you're globally serializing all return-to-user instances. How is that not a problem?It was the original way we did things. The next patch changes this to SRCU. But it requires a bit more care. For breaking up the series, I preferred not to add that logic and make it a separate patch. For better reviewing, I'll add a comment here that says: /* TODO switch this global lock to SRCU */
Oh ffs :-( So splitting up patches is for ease of review, but now you're making splits that make review harder, how does that make sense?