Thread: 27 messages, 3 authors, 2025-10-10

Re: [PATCH v7 00/23] mm/ksw: Introduce real-time KStackWatch debugging tool

From: Andrew Morton <akpm@linux-foundation.org>
Date: 2025-10-10 00:51:10
Also in: linux-doc, linux-mm, linux-perf-users, lkml, llvm, workflows

On Thu,  9 Oct 2025 18:55:36 +0800 Jinchao Wang wrote:
> This patch series introduces KStackWatch, a lightweight debugging tool to detect
> kernel stack corruption in real time. It installs a hardware breakpoint
> (watchpoint) at a function's specified offset using `kprobe.post_handler` and
> removes it in `fprobe.exit_handler`. This covers the full execution window and
> reports corruption immediately with time, location, and a call stack.
>
> The motivation comes from scenarios where corruption occurs silently in one
> function but manifests later in another, without a direct call trace linking
> the two. Such bugs are often extremely hard to debug with existing tools.
> These scenarios are demonstrated in test 3–5 (silent corruption test, patch 20).
>
> ...
>
>  20 files changed, 1809 insertions(+), 62 deletions(-)

It's obviously a substantial project.  We need to decide whether to add
this to Linux.

There are some really important [0/N] changelog details which I'm not
immediately seeing:

Am I correct in thinking that it's x86-only?  If so, what's involved in
enabling other architectures?  Is there any such work in progress?

What motivated the work?  Was there some particular class of failures
which you were persistently seeing and wished to fix more efficiently?

Has this code (or something like it) been used in production systems? 
If so, by whom and with what results?

Has it actually found some kernel bugs yet?  If so, details please.

Can this be enabled on production systems?  If so, what is the
measured runtime overhead?