Thread: 52 messages, 5 authors, 2021-02-05

Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace

From: Mark Rutland <mark.rutland@arm.com>
Date: 2021-01-27 16:42:35

On Wed, Jan 27, 2021 at 08:02:41AM -0600, Madhavan T. Venkataraman wrote:

> On 10/12/20 12:26 PM, Mark Brown wrote:
>> This patch series aims to implement reliable stacktrace for arm64.
>> Reliable stacktrace exists mainly to support live patching, it provides
>> a version of stacktrace that checks for consistency problems in the
>> traces it generates and provides an error code to callers indicating if
>> any problems were detected.
>>
>> This is a first cut of support for arm64, I've not really even started
>> testing it meaningfully at this point.  The main thing I'm looking for
>> here is that I'm not sure if there are any more potential indicators of
>> unreliable stacks that I'm missing tests for or anything about the
>> interfaces that I've misunderstood.
>>
>> There's more work that can be done here, mainly that we could sync our
>> unwinder more with what's done on S/390 and x86 which should if nothing
>> else help with keeping up to date with generic changes, but this should
>> be what's needed to allow reliable stack trace.
>>
>> Mark Brown (2):
>>   arm64: stacktrace: Report when we reach the end of the stack
>>   arm64: stacktrace: Implement reliable stacktrace
>>
>> Mark Rutland (1):
>>   arm64: remove EL0 exception frame record
>>
>>  arch/arm64/Kconfig             |  1 +
>>  arch/arm64/kernel/entry.S      | 10 +++----
>>  arch/arm64/kernel/stacktrace.c | 55 ++++++++++++++++++++++++++++------
>>  3 files changed, 52 insertions(+), 14 deletions(-)
>
> This is mostly a question to improve my understanding of the current ARM64
> unwinder.
>
> Currently, ARM64 defines different stack types - task stack, IRQ stack, etc.
> When it unwinds, it appears to unwind only the currently active stack.

The current (unreliable) unwinder will unwind across stack changes. It
detects stack transitions and will happily unwind across multiple
stacks so long as these do not loop.

However, where a backtrace crosses an exception boundary, there are
cases where this could in theory omit an entry from the backtrace,
because the LR and FP are only guaranteed to be in a consistent state
at function call boundaries. Since exceptions can be taken in the
middle of functions (or trampolines which transiently place these in an
inconsistent state), we cannot reliably backtrace across exception
boundaries (which may or may not involve a change of stack) unless we
had additional metadata and/or guarantees from compilers on how these
are manipulated.

Where we change stack without an exception boundary, we can reliably
unwind.
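
To illustrate the distinction above, here is a minimal user-space sketch
of a walker that tolerates a plain stack switch but reports an error as
soon as it meets an exception boundary. The struct sim_frame type, its
fields, and walk_reliable() are hypothetical names made up for this
example; they are not the kernel's struct stackframe or any existing
arm64 interface.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified frame description (not the kernel's type). */
struct sim_frame {
    const struct sim_frame *next;  /* caller's frame, NULL at the end */
    bool stack_switch;             /* caller's frame is on another stack */
    bool exception_boundary;       /* frame was created by exception entry */
};

/*
 * Returns the number of frames walked, or -EINVAL if the trace crosses
 * an exception boundary and so cannot be trusted. A stack switch on its
 * own is deliberately not treated as an error.
 */
static int walk_reliable(const struct sim_frame *f)
{
    int n = 0;

    for (; f; f = f->next) {
        if (f->exception_boundary)
            return -EINVAL;
        n++;
    }
    return n;
}

/* Builds a three-frame chain that switches stack at the middle frame. */
static int demo_walk(bool with_exception)
{
    struct sim_frame task_fn = { NULL, false, false };
    struct sim_frame entry   = { &task_fn, true, with_exception };
    struct sim_frame irq_fn  = { &entry, false, false };

    return walk_reliable(&irq_fn);
}
```

In this model demo_walk(false) walks all three frames, while
demo_walk(true) fails with -EINVAL, which is the kind of error code a
reliable stacktrace is expected to hand back to its caller.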

> Specifically, if an interrupt has happened and the IRQ stack is the one that
> is active, only the IRQ stack is unwound. The task stack is not. Is this
> accurate?

The existing (unreliable) unwinder will unwind this case. The last frame
record on the IRQ stack will point to a frame record on the task stack,
and the unwinder will determine this can be safely accessed via the
on_accessible_stack() check. It will subsequently reject any frame
records on the IRQ stack (i.e. loops).
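
As a rough illustration of the behaviour described above, the following
user-space sketch follows a chain of frame records from an IRQ stack
onto a task stack and rejects any record that points back to a stack
already left. It is a simplified model, not the code in
arch/arm64/kernel/stacktrace.c: classify() is a hypothetical stand-in
for the on_accessible_stack() check, and the two-stack layout, the type
names, and demo_unwind() are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Frame record as laid out by AAPCS64: saved FP, then saved LR. */
struct frame_record {
    struct frame_record *fp;
    uintptr_t lr;
};

enum stack_id { STACK_TASK, STACK_IRQ, STACK_NONE };

struct stack_range { uintptr_t low, high; };

/* Hypothetical stand-in for on_accessible_stack(): which stack holds p? */
static enum stack_id classify(const struct stack_range *stacks, const void *p)
{
    uintptr_t a = (uintptr_t)p;

    for (int i = 0; i < STACK_NONE; i++)
        if (a >= stacks[i].low &&
            a + sizeof(struct frame_record) <= stacks[i].high)
            return (enum stack_id)i;
    return STACK_NONE;
}

/* Returns the number of records walked, or -1 if the chain is rejected. */
static int unwind(const struct stack_range *stacks,
                  const struct frame_record *fp)
{
    bool done[STACK_NONE] = { false };
    enum stack_id prev = STACK_NONE;
    const struct frame_record *prev_fp = NULL;
    int n = 0;

    while (fp) {
        enum stack_id cur = classify(stacks, fp);

        if (cur == STACK_NONE || done[cur])
            return -1;      /* unknown stack, or a stack we already left */
        if (cur == prev) {
            /* On one stack, records must move towards higher addresses. */
            if ((uintptr_t)fp <= (uintptr_t)prev_fp)
                return -1;
        } else if (prev != STACK_NONE) {
            done[prev] = true;  /* stack transition: never go back */
        }
        prev = cur;
        prev_fp = fp;
        n++;
        fp = fp->fp;
    }
    return n;
}

/* Two records on the IRQ stack chained to two records on the task stack. */
static int demo_unwind(bool make_loop)
{
    static struct frame_record task[2], irq[2];
    const struct stack_range stacks[STACK_NONE] = {
        [STACK_TASK] = { (uintptr_t)&task[0], (uintptr_t)&task[2] },
        [STACK_IRQ]  = { (uintptr_t)&irq[0],  (uintptr_t)&irq[2]  },
    };

    irq[0].fp = &irq[1];
    irq[1].fp = &task[0];
    task[0].fp = &task[1];
    task[1].fp = make_loop ? &irq[0] : NULL;
    return unwind(stacks, &irq[0]);
}
```

With make_loop false the walk visits all four records. With make_loop
true the last task-stack record points back at the IRQ stack, and the
walk is rejected.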

> My question is - for live patching, we would need to look at the task stack
> as well, right?

Ideally, we would be able to do this, but currently we cannot safely do
so. IIUC this means that live patching is still possible, but is
potentially much slower to apply updates.

> Maybe we need to pass a flag to the unwinder to check the
> task stack in addition to the active task?

The logic to unwind across stack and exception boundaries already
exists, but to make this reliable we will need more invasive work,
potentially changing trampolines and/or adding metadata for these,
perhaps requiring objtool and/or toolchain changes.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel