Thread (20 messages) 20 messages, 6 authors, 2021-06-29

Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder

From: Madhavan T. Venkataraman <hidden>
Date: 2021-06-25 17:18:44
Also in: live-patching, lkml


On 6/25/21 12:05 PM, Madhavan T. Venkataraman wrote:

On 6/25/21 10:51 AM, Mark Brown wrote:
quoted
On Fri, Jun 25, 2021 at 10:39:57AM -0500, Madhavan T. Venkataraman wrote:
quoted
On 6/24/21 9:40 AM, Mark Rutland wrote:
quoted
quoted
At a high-level, I'm on-board with keeping track of this per unwind
step, but if we do that then I want to be abel to use this during
regular unwinds (e.g. so that we can have a backtrace idicate when a
step is not reliable, like x86 does with '?'), and to do that we need to
be a little more accurate.
quoted
The only consumer of frame->reliable is livepatch. So, in retrospect, my
original per-frame reliability flag was an overkill. I was just trying to
provide extra per-frame debug information which is not really a requirement
for livepatch.
It's not a requirement for livepatch but if it's there a per frame
reliability flag would have other uses - for example Mark has mentioned
the way x86 prints a ? next to unreliable entries in oops output for
example, that'd be handy for people debugging issues and would have the
added bonus of ensuring that there's more constant and widespread
exercising of the reliability stuff than if it's just used for livepatch
which is a bit niche.
I agree. That is why I introduced the per-frame flag.

So, let us try a different approach.

First, let us get rid of the frame->reliable flag from this patch series. That flag
can be implemented when all of the pieces are in place for per-frame debug and tracking.

For consumers such as livepatch that don't really care about per-frame stuff, let us
solve it more cleanly via the return value of unwind_frame().

Currently, the return value from unwind_frame() is a tri-state return value which is
somewhat confusing.

	0	means continue unwinding
	-error	means stop unwinding. However,
			-ENOENT means successful termination
			Other values mean an error has happened.

Instead, let unwind_frame() return one of 3 values:

enum {
	UNWIND_CONTINUE,
	UNWIND_CONTINUE_WITH_ERRORS,
	UNWIND_STOP,
};
Sorry. I need to add one more value to this. So, the enum will be:

enum {
	UNWIND_CONTINUE,
	UNWIND_CONTINUE_WITH_ERRORS,
	UNWIND_STOP,
	UNWIND_STOP_WITH_ERRORS,
};

UNWIND_CONTINUE (what used to be a return value of 0)
	Continue with the unwind.

UNWIND_CONTINUE_WITH_ERRORS (new return value)
	Errors encountered. But the errors are not fatal errors like stack corruption.

UNWIND_STOP (what used to be -ENOENT)
	Successful termination of unwind.

UNWIND_STOP_WITH_ERRORS (what used to be -EINVAL, etc)
	Unsuccessful termination.

Sorry I missed this the last time.

So, to reiterate:

All consumers will stop unwinding when they see UNWIND_STOP and UNWIND_STOP_WITH_ERRORS.

Debug type consumers can choose to continue when they see UNWIND_CONTINUE_WITH_ERRORS.

Livepatch type consumers will only continue on UNWIND_CONTINUE.

This way, my patch series does not have a dependency on the per-frame enhancements.

Thanks!

Madhavan


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help