
Re: [PATCH v3 4/4] panic: use sys_info_with_filter() to avoid duplicate backtraces

From: Feng Tang <hidden>
Date: 2026-06-29 11:41:04
Also in: lkml, stable

On Fri, Jun 26, 2026 at 02:14:14PM +0200, Petr Mladek wrote:
On Fri 2026-06-26 12:23:50, Petr Mladek wrote:
On Thu 2026-06-25 15:25:58, Bradley Morgan wrote:
panic_other_cpus_shutdown() handles SYS_INFO_ALL_BT before stopping the
other CPUs. Do not ask sys_info() to handle that bit again later in the
panic path.

Use sys_info_with_filter() so panic_print=all_bt does not request more
output after the CPUs are stopped.

Fixes: a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <redacted>
---
 kernel/panic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/panic.c b/kernel/panic.c
index 213725b612aa..eb842823df61 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -680,7 +680,7 @@ void vpanic(const char *fmt, va_list args)
 	 */
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
-	sys_info(panic_print);
+	sys_info_with_filter(panic_print, SYS_INFO_ALL_BT);
Hmm, this prevents printing backtraces from all CPUs completely.
But what if they were not printed?

They might be printed by:

static void panic_other_cpus_shutdown(bool crash_kexec)
{
	if (panic_print & SYS_INFO_ALL_BT)
		panic_trigger_all_cpu_backtrace();

[...]
}

But it checks only "panic_print" variable. It won't do anything
when (panic_print == 0).

In this case, we might still want to print the backtraces when
SYS_INFO_ALL_BT is set in kernel_si_info.
 	kmsg_dump_desc(KMSG_DUMP_PANIC, buf);
Of course, we might fix panic_other_cpus_shutdown() to also check
kernel_si_info.
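
For illustration, such a check might look like the user-space model below. It is only a sketch: the bit value, the stubbed panic_trigger_all_cpu_backtrace(), and the request counter are made up for this example, and treating kernel_si_info as a fallback that is consulted only when panic_print is zero is an assumption based on the "universal fallback" wording further down.

```c
#include <stdbool.h>

/* Hypothetical bit value; the real one is defined in the kernel headers. */
#define SYS_INFO_ALL_BT 0x40UL

/* User-space stand-ins for the kernel variables named in the thread. */
static unsigned long panic_print;
static unsigned long kernel_si_info;

/* Counts how often the stubbed all-CPU backtrace was requested. */
static int all_bt_requests;

static void panic_trigger_all_cpu_backtrace(void)
{
	all_bt_requests++;
}

static void panic_other_cpus_shutdown(bool crash_kexec)
{
	/* Assumption: kernel_si_info is used only when panic_print is 0. */
	unsigned long mask = panic_print ? panic_print : kernel_si_info;

	(void)crash_kexec;

	if (mask & SYS_INFO_ALL_BT)
		panic_trigger_all_cpu_backtrace();

	/* [...] stopping the other CPUs is left out of this model */
}
```

With panic_print == 0 and SYS_INFO_ALL_BT set in kernel_si_info, the model requests the backtraces once, which is the case the original check misses.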

But it all becomes very hairy. We have several levels:

   + watchdog-all_bt-specific option, e.g. sysctl_hardlockup_all_cpu_backtrace

   + watchdog-specific si_info preferences, e.g. hardlockup_si_mask

   + panic-specific si_info: panic_print

   + universal fallback for any layer: kernel_si_info

Now, we try to check all these variables back and forth to
trigger all backtraces or to avoid triggering them.
And it clearly does not work well, and the code keeps getting
hairier.

I am thinking about another approach. The word "waterfall" comes to my mind.
Instead of checking all the settings back and forth, let's process
each setting one by one and just remember what has been done and
skip this in the next level.

All the si_info actions seem to dump a global system state.
So, it would make sense to remember the state in a global variable
even when it might be modified by more CPUs in parallel.

I am going to think more about it.
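
The "waterfall" idea can be modelled in a few lines of user-space C. This is a rough sketch, not code from the POC: the SI_* bit values, si_done, waterfall_step(), and waterfall() are hypothetical names chosen for this example, and the global is deliberately a plain variable, in line with the remark that parallel modification is tolerated.

```c
#include <stddef.h>

/* Hypothetical bit values, for illustration only. */
#define SI_TASKS  0x1UL
#define SI_MEM    0x2UL
#define SI_ALL_BT 0x4UL

/* Global record of what has already been dumped during this event. */
static unsigned long si_done;

/*
 * Process one level: return the bits this level still has to dump and
 * remember them so that the following levels skip them.
 */
static unsigned long waterfall_step(unsigned long requested)
{
	unsigned long todo = requested & ~si_done;

	si_done |= todo;
	return todo;
}

/* Run the levels in order; out[i] receives what level i actually dumps. */
static void waterfall(const unsigned long *levels, size_t n,
		      unsigned long *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = waterfall_step(levels[i]);
}
```

Each level only looks at its own setting and at the record of completed work, so no level needs to know which other settings exist.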
I have created a POC using Gemini. I haven't tested it.
But it looks acceptable. And the logic seems to be more
straightforward.

One drawback is that it requires adding the _reset()
call for all sys_info() callers. It is fine in principle
but it might complicate back-porting because all changes
have to be done in one patch.

But honestly, this is a nice to have fix. Most people could
live happily without it.

From 3c66436d9978030845a96bfaedd6b914536e2ac4 Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@suse.com>
Date: Fri, 26 Jun 2026 13:55:41 +0200
Subject: [POC] sys_info: Introduce state-tracking APIs to prevent duplicate
 backtraces

In watchdog, panic, and hung task detection scenarios, sys_info() can
be called multiple times or alongside direct backtrace triggers like
trigger_allbutcpu_cpu_backtrace(). This results in identical backtraces
being dumped repeatedly from all CPUs, cluttering the kernel log and
delaying or obscuring critical debug details.

Introduce a state tracking bitmask and associated helpers:
- sys_info_done(mask): Marks specific sys_info bits as already printed.
- sys_info_reset(): Resets the tracking state.
- sys_info_is_done(mask): Checks if all bits in the mask have been printed.

Update sys_info() to automatically filter out already printed bits
using this state. Integrate these APIs with the generic hardlockup
and softlockup watchdogs, the PowerPC watchdog, the hung task detector,
and the panic core. This ensures that each piece of system information
and backtrace output is printed at most once per lockup/panic event,
and the state is reset cleanly when a lockup does not trigger a panic.

Races between sys_info() callers are ignored. It should be acceptable
because the output from various watchdogs has never been synchronized.
And panic() never returns.
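
The three helpers and the filtering in sys_info() described above could behave roughly like the following user-space model. It is a sketch under stated assumptions and not the code from the patch: the bit values, sys_info_done_mask, and last_dumped are invented for this example, and the real sys_info() would print instead of recording a mask. The state is a plain non-atomic variable, mirroring the decision to ignore races.

```c
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define SYS_INFO_TASKS  0x1UL
#define SYS_INFO_ALL_BT 0x2UL

/* Bits that have already been printed during the current event. */
static unsigned long sys_info_done_mask;

/* Demo aid: bits handled by the most recent sys_info() call. */
static unsigned long last_dumped;

/* Mark specific bits as already printed. */
static void sys_info_done(unsigned long mask)
{
	sys_info_done_mask |= mask;
}

/* Forget everything, e.g. when a lockup did not end in a panic. */
static void sys_info_reset(void)
{
	sys_info_done_mask = 0;
}

/* True only if all bits in the mask have been printed. */
static bool sys_info_is_done(unsigned long mask)
{
	return (sys_info_done_mask & mask) == mask;
}

/* Handle only the bits that have not been printed yet. */
static void sys_info(unsigned long si_mask)
{
	unsigned long todo = si_mask & ~sys_info_done_mask;

	last_dumped = todo;	/* the kernel would print the dumps here */
	sys_info_done(todo);
}
```

A caller that triggers the backtraces directly would call sys_info_done(SYS_INFO_ALL_BT) afterwards, so a later sys_info() with the same bit set skips them until sys_info_reset() is called.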

Assisted-by: gemini-1.5-flash
Signed-off-by: Petr Mladek <pmladek@suse.com>
Yep. There are cases where people want to panic on a hung task or a sw/hw
lockup, and this could remove much of the duplicated sys info dumping, thanks!

Reviewed-by: Feng Tang <redacted>
---
 arch/powerpc/kernel/watchdog.c | 13 ++++++++++---
 include/linux/sys_info.h       |  3 +++
 kernel/hung_task.c             |  2 ++
 kernel/panic.c                 |  4 +++-
 kernel/watchdog.c              | 10 ++++++++--
 lib/sys_info.c                 | 30 +++++++++++++++++++++++++++++-
 6 files changed, 55 insertions(+), 7 deletions(-)
  