Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Mark Rutland <mark.rutland@arm.com>
Date: 2021-03-09 16:06:32
Also in:
linuxppc-dev, lkml
On Thu, Mar 04, 2021 at 03:54:48PM -0600, Segher Boessenkool wrote:
Hi!
Hi Segher,
On Thu, Mar 04, 2021 at 02:57:30PM +0000, Mark Rutland wrote:
It looks like GCC is happy to give us the function-entry-time FP if we use __builtin_frame_address(1),

From the GCC manual:

  Calling this function with a nonzero argument can have unpredictable effects, including crashing the calling program. As a result, calls that are considered unsafe are diagnosed when the '-Wframe-address' option is in effect. Such calls should only be made in debugging situations.

It *does* warn (the warning is in -Wall btw), on both powerpc and aarch64.

Furthermore, using this builtin causes lousy code (it forces the use of a frame pointer, which we normally try very hard to optimise away, for good reason).

And, that warning is not an idle warning. Non-zero arguments to __builtin_frame_address can crash the program. It won't on simpler functions, but there is no real definition of what a simpler function *is*. It is meant for debugging, not for production use (this is also why no one has bothered to make it faster).

On Power it should work, but on pretty much any other arch it won't.
I understand this is true generally, and cannot be relied upon in portable code. However, as you hint here for Power, I believe that on arm64 __builtin_frame_address(1) shouldn't crash the program due to the way frame records work on arm64, but I'll go check with some local compiler folk. I agree that __builtin_frame_address(2) and beyond certainly can, e.g. by NULL dereference and similar.

For context, why do you think this would work on power specifically? I wonder if our rationale is similar.

Are you aware of anything in particular that breaks using __builtin_frame_address(1) in non-portable code, or is this just a general sentiment of this not being a supported use-case?
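
To make the frame-record reasoning above concrete, here is a minimal sketch of a defensive walk over an arm64-style chain, where each record holds the caller's FP followed by the saved LR. It is a portable model and not kernel code: `struct frame_record`, `walk_frame_records()` and the `lo`/`hi` stack bounds are hypothetical names chosen for illustration. It assumes a downward-growing stack, so each caller's record sits at a higher address than its callee's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an arm64-style frame record: {caller FP, saved LR}. */
struct frame_record {
	struct frame_record *fp;	/* caller's record, NULL at the outermost frame */
	uintptr_t lr;			/* return address into the caller */
};

/*
 * Walk the chain starting at 'fp', storing at most 'max' return addresses
 * in 'out'. Every record must lie inside [lo, hi), be pointer-aligned, and
 * link strictly towards higher addresses; anything else ends the walk
 * instead of dereferencing a bad pointer. Returns the number of entries.
 */
static size_t walk_frame_records(const struct frame_record *fp,
				 const void *lo, const void *hi,
				 uintptr_t *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL || max == 0);

	while (fp && n < max) {
		uintptr_t addr = (uintptr_t)fp;

		if (addr < (uintptr_t)lo ||
		    addr + sizeof(*fp) > (uintptr_t)hi ||
		    addr % sizeof(void *))
			break;

		out[n++] = fp->lr;

		/* A link that does not move up the stack is corrupt or a loop. */
		if (fp->fp && (uintptr_t)fp->fp <= addr)
			break;
		fp = fp->fp;
	}
	return n;
}
```

The checks are the point of the sketch: reading one record is harmless as long as the starting FP is valid, while following links blindly is where a NULL or stray pointer gets dereferenced.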
Unless we can get some strong guarantees from compiler folk such that we can guarantee a specific function acts as a boundary for unwinding (and doesn't itself get split, etc), the only reliable way I can think to solve this requires an assembly trampoline. Whatever we do is liable to need some invasive rework.

You cannot get such a guarantee, other than not letting the compiler see into the routine at all, like with assembler code (not inline asm, real assembler code).
If we cannot reliably ensure this then I'm happy to go write an assembly trampoline to snapshot the state at a function call boundary (where our procedure call standard mandates the state of the LR, FP, and frame records pointed to by the FP). This'll require reworking a reasonable amount of code cross-architecture, so I'll need to get some more concrete justification (e.g. examples of things that can go wrong in practice).
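
For illustration only, the state such a snapshot would record can be approximated in C with the level-0 GCC/Clang builtins. This is not the assembly trampoline itself: `struct unwind_snapshot` and `snapshot_here()` are hypothetical names, and because the compiler can still see (and clone or split) the function body, the sketch has exactly the weakness raised above. A real version would be a standalone assembler routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the unwind state at a call boundary. */
struct unwind_snapshot {
	void *fp;	/* frame address of snapshot_here() itself */
	void *pc;	/* return address into the caller */
};

/*
 * noinline keeps a real call here, but gives no guarantee that this
 * function forms a stable unwind boundary. Only level-0 builtins are
 * used, which avoids the -Wframe-address diagnostic.
 */
__attribute__((noinline))
static void snapshot_here(struct unwind_snapshot *s)
{
	assert(s != NULL);
	s->fp = __builtin_frame_address(0);
	s->pc = __builtin_return_address(0);
}
```

A caller would invoke `snapshot_here(&s)` and start unwinding from `s.fp` and `s.pc`. Whether those values describe a well-formed frame at that point is precisely what the compiler does not promise.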
The real way forward is to bite the bullet and to no longer pretend you can do a full backtrace from just the stack contents. You cannot.
I think what you mean here is that there's no reliable way to handle the current/leaf function, right? If so I do agree.

Beyond that I believe that arm64's frame records should be sufficient.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel