Re: [PATCH v11 06/14] unwind_user/deferred: Add deferred unwinding interface
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-06-26 20:34:38
Also in: bpf, lkml
Subsystem: the rest, userspace stack unwinding
Maintainers: Linus Torvalds, Josh Poimboeuf, Steven Rostedt
On Thu, 26 Jun 2025 12:48:55 -0400 Steven Rostedt [off-list ref] wrote:
> >  static __always_inline void unwind_reset_info(void)
> >  {
> > -	if (unlikely(current->unwind_info.cache))
> > +	/* Exit out early if this was never used */
> > +	if (likely(!current->unwind_info.timestamp))
> > +		return;
>
> I found that this breaks the use of perf using the unwind_user_faultable()
> directly and not relying on the deferred infrastructure (which it does when
> it traces a single task and also needs to remove the separate in_nmi()
> code).
>
> Because this still requires the nr_entries to be set to zero.
>
> The clearing of the nr_entries has to be separate from the timestamp.
>
> To prevent unneeded writes after the cache is allocated, should we check
> the nr_entries is set before writing zero?
>
> 	if (current->unwind_info.cache && current->unwind_info.cache->nr_entries)
> 		current->unwind_info.cache->nr_entries = 0;
>
> ?

I just made this into:

	if (current->unwind_info.cache)
		current->unwind_info.cache->nr_entries = 0;

since later patches will add more here, and I added a new patch that adds a
USED bit to info->unwind_mask. The bit gets set whenever the stack trace is
used and this code needs to be executed. That makes the unwind_mask the only
condition that needs to be checked when unwinding was never used.

-- Steve

From: Steven Rostedt <rostedt@goodmis.org>
Subject: [PATCH] unwind: Add USED bit to only have one conditional on way back to user space

On the way back to user space, the function unwind_reset_info() is called
unconditionally (but always inlined). It currently has two conditionals.
One checks the unwind_mask, which is set whenever a deferred trace is
called and is used to know that the mask needs to be cleared. The other
checks if the cache has been allocated, and if so, it resets the
nr_entries so that the unwinder knows it needs to do the work to get a new
user space stack trace again (it only does it once per entering the
kernel).

Use one of the bits in the unwind mask as a "USED" bit that gets set
whenever a trace is created. This makes it possible to check only the
unwind_mask in unwind_reset_info() to know if it needs to do work or not,
and eliminates a conditional that happens every time the task goes back to
user space.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/linux/unwind_deferred.h | 14 +++++++-------
 kernel/unwind/deferred.c        |  5 ++++-
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/linux/unwind_deferred.h b/include/linux/unwind_deferred.h
index e7bf133c5a20..4786acc0f087 100644
--- a/include/linux/unwind_deferred.h
+++ b/include/linux/unwind_deferred.h
@@ -21,6 +21,10 @@ struct unwind_work {
 #define UNWIND_PENDING_BIT		(BITS_PER_LONG - 1)
 #define UNWIND_PENDING			BIT(UNWIND_PENDING_BIT)
 
+/* Set if the unwinding was used (directly or deferred) */
+#define UNWIND_USED_BIT			(UNWIND_PENDING_BIT - 1)
+#define UNWIND_USED			BIT(UNWIND_USED_BIT)
+
 enum {
 	UNWIND_ALREADY_PENDING	= 1,
 	UNWIND_ALREADY_EXECUTED	= 2,
@@ -51,14 +55,10 @@ static __always_inline void unwind_reset_info(void)
 				return;
 		} while (!try_cmpxchg(&info->unwind_mask, &bits, 0UL));
 		local64_set(&current->unwind_info.timestamp, 0);
+
+		if (unlikely(info->cache))
+			info->cache->nr_entries = 0;
 	}
-	/*
-	 * As unwind_user_faultable() can be called directly and
-	 * depends on nr_entries being cleared on exit to user,
-	 * this needs to be a separate conditional.
-	 */
-	if (unlikely(info->cache))
-		info->cache->nr_entries = 0;
 }
 
 #else /* !CONFIG_UNWIND_USER */
diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c
index c783d273a2dc..9ec1e74c6469 100644
--- a/kernel/unwind/deferred.c
+++ b/kernel/unwind/deferred.c
@@ -131,6 +131,9 @@ int unwind_user_faultable(struct unwind_stacktrace *trace)
 
 	cache->nr_entries = trace->nr;
 
+	/* Clear nr_entries on way back to user space */
+	set_bit(UNWIND_USED_BIT, &info->unwind_mask);
+
 	return 0;
 }
 
@@ -325,7 +328,7 @@ int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func)
 	guard(mutex)(&callback_mutex);
 
 	/* See if there's a bit in the mask available */
-	if (unwind_mask == ~(UNWIND_PENDING))
+	if (unwind_mask == ~(UNWIND_PENDING|UNWIND_USED))
 		return -EBUSY;
 
 	work->bit = ffz(unwind_mask);
--
2.47.2
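
For anyone who wants to poke at the idea outside the kernel, here is a rough
userspace model of what the patch above does. It is only a sketch: the
demo_* names and structs are made up for illustration, the macros are
stand-ins for the kernel ones, and it is single-threaded, so plain loads
and stores replace the try_cmpxchg() and set_bit() used in the real code.
The allocator at the end assumes a registration sets the bit it finds,
which the hunk above does not show.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros used in the patch. */
#define BITS_PER_LONG		((int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT(n)			(1UL << (n))
#define UNWIND_PENDING_BIT	(BITS_PER_LONG - 1)
#define UNWIND_PENDING		BIT(UNWIND_PENDING_BIT)
#define UNWIND_USED_BIT		(UNWIND_PENDING_BIT - 1)
#define UNWIND_USED		BIT(UNWIND_USED_BIT)

/* Hypothetical, simplified per-task state (not the kernel structs). */
struct demo_cache {
	unsigned int nr_entries;
};

struct demo_info {
	unsigned long unwind_mask;
	struct demo_cache *cache;
};

/* Models unwind_user_faultable(): store a trace and mark the task as USED. */
static void demo_record_trace(struct demo_info *info, unsigned int nr)
{
	assert(info->cache != NULL);
	info->cache->nr_entries = nr;
	info->unwind_mask |= UNWIND_USED;	/* kernel: set_bit() */
}

/*
 * Models unwind_reset_info(): the mask is the only thing tested on the
 * common path. Returns true if any state was actually reset.
 */
static bool demo_reset_info(struct demo_info *info)
{
	if (!info->unwind_mask)
		return false;			/* never used: nothing to do */

	if (info->unwind_mask & UNWIND_PENDING)
		return false;			/* task_work will run first */

	info->unwind_mask = 0;			/* kernel: try_cmpxchg() loop */
	if (info->cache)
		info->cache->nr_entries = 0;
	return true;
}

/*
 * Models the check in unwind_deferred_init(): hand out the lowest free
 * tracer bit, never one of the two reserved bits. Returns -1 when full.
 */
static int demo_alloc_bit(unsigned long *global_mask)
{
	if (*global_mask == ~(UNWIND_PENDING | UNWIND_USED))
		return -1;

	for (int bit = 0; bit < UNWIND_USED_BIT; bit++) {
		if (!(*global_mask & BIT(bit))) {
			*global_mask |= BIT(bit);
			return bit;
		}
	}
	return -1;
}
```

The point of the model is that a direct caller of the unwinder (no deferred
work registered) still gets nr_entries cleared, because recording a trace
sets the USED bit and that alone makes the mask non-zero.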