Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
From: Madhavan T. Venkataraman <hidden>
Date: 2021-06-25 17:18:44
Also in:
live-patching, lkml
On 6/25/21 12:05 PM, Madhavan T. Venkataraman wrote:
On 6/25/21 10:51 AM, Mark Brown wrote:quoted
On Fri, Jun 25, 2021 at 10:39:57AM -0500, Madhavan T. Venkataraman wrote:quoted
On 6/24/21 9:40 AM, Mark Rutland wrote:quoted
quoted
At a high-level, I'm on-board with keeping track of this per unwind step, but if we do that then I want to be abel to use this during regular unwinds (e.g. so that we can have a backtrace idicate when a step is not reliable, like x86 does with '?'), and to do that we need to be a little more accurate.quoted
The only consumer of frame->reliable is livepatch. So, in retrospect, my original per-frame reliability flag was an overkill. I was just trying to provide extra per-frame debug information which is not really a requirement for livepatch.It's not a requirement for livepatch but if it's there a per frame reliability flag would have other uses - for example Mark has mentioned the way x86 prints a ? next to unreliable entries in oops output for example, that'd be handy for people debugging issues and would have the added bonus of ensuring that there's more constant and widespread exercising of the reliability stuff than if it's just used for livepatch which is a bit niche.I agree. That is why I introduced the per-frame flag. So, let us try a different approach. First, let us get rid of the frame->reliable flag from this patch series. That flag can be implemented when all of the pieces are in place for per-frame debug and tracking. For consumers such as livepatch that don't really care about per-frame stuff, let us solve it more cleanly via the return value of unwind_frame(). Currently, the return value from unwind_frame() is a tri-state return value which is somewhat confusing. 0 means continue unwinding -error means stop unwinding. However, -ENOENT means successful termination Other values mean an error has happened. Instead, let unwind_frame() return one of 3 values: enum { UNWIND_CONTINUE, UNWIND_CONTINUE_WITH_ERRORS, UNWIND_STOP, };
Sorry. I need to add one more value to this. So, the enum will be:
enum {
UNWIND_CONTINUE,
UNWIND_CONTINUE_WITH_ERRORS,
UNWIND_STOP,
UNWIND_STOP_WITH_ERRORS,
};
UNWIND_CONTINUE (what used to be a return value of 0)
Continue with the unwind.
UNWIND_CONTINUE_WITH_ERRORS (new return value)
Errors encountered. But the errors are not fatal errors like stack corruption.
UNWIND_STOP (what used to be -ENOENT)
Successful termination of unwind.
UNWIND_STOP_WITH_ERRORS (what used to be -EINVAL, etc)
Unsuccessful termination.
Sorry I missed this the last time.
So, to reiterate:
All consumers will stop unwinding when they see UNWIND_STOP and UNWIND_STOP_WITH_ERRORS.
Debug type consumers can choose to continue when they see UNWIND_CONTINUE_WITH_ERRORS.
Livepatch type consumers will only continue on UNWIND_CONTINUE.
This way, my patch series does not have a dependency on the per-frame enhancements.
Thanks!
Madhavan
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel