Re: [PATCH v4 17/39] unwind_user/sframe: Add support for reading .sframe headers
From: Josh Poimboeuf <jpoimboe@kernel.org>
Date: 2025-02-04 18:26:14
Also in: linux-perf-users, linux-toolchains, lkml
On Wed, Jan 29, 2025 at 04:02:34PM -0800, Andrii Nakryiko wrote:
On Tue, Jan 28, 2025 at 6:02 PM Josh Poimboeuf [off-list ref] wrote:

I'm not sure about this chunked lookup approach for arbitrary user space applications. Those executable sections can be a) big and b) discontiguous.

E.g., one of the production binaries I looked at. Here are its three main executable sections:

  ...
  [17] .bolt.org.text    PROGBITS         000000000b00e640  0ae0d640
       0000000011ad621c  0000000000000000  AX       0     0     64
  ...
  [48] .text             PROGBITS         000000001e600000  1ce00000
       0000000000775dd8  0000000000000000  AX       0     0     2097152
  [49] .text.cold        PROGBITS         000000001ed75e00  1d575e00
       00000000007d3271  0000000000000000  AX       0     0     64
  ...

Total text size is about 300MB:

  >>> 0x0000000000775dd8 + 0x00000000007d3271 + 0x0000000011ad621c
  312603237

Section #17 ends at:

  >>> hex(0x0000000011ad621c + 0x000000000b00e640)
  '0x1cae485c'

While .text starts at 000000001e600000, so we have a gap of ~28MB:

  >>> 0x000000001e600000 - 0x1cae485c
  28424100

So unless we do something more clever to support multiple discontiguous chunks, this seems like a bad fit for user space.
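For anyone who wants to redo the arithmetic above in one go, here is a minimal sketch. The addresses and sizes are copied from the readelf output quoted above; the helper names (`total_size`, `gaps`) are made up for illustration and are not part of any tool.

```python
# (name, address, size) for the three executable sections quoted above.
SECTIONS = [
    (".bolt.org.text", 0x000000000B00E640, 0x0000000011AD621C),
    (".text",          0x000000001E600000, 0x0000000000775DD8),
    (".text.cold",     0x000000001ED75E00, 0x00000000007D3271),
]


def total_size(sections):
    """Sum of the section sizes in bytes."""
    return sum(size for _name, _addr, size in sections)


def gaps(sections):
    """Return (prev_name, next_name, gap_bytes) for each pair of
    neighbouring sections, ordered by start address."""
    ordered = sorted(sections, key=lambda s: s[1])
    result = []
    for (pname, paddr, psize), (nname, naddr, _nsize) in zip(ordered, ordered[1:]):
        result.append((pname, nname, naddr - (paddr + psize)))
    return result


if __name__ == "__main__":
    print("total text bytes:", total_size(SECTIONS))
    for prev_name, next_name, gap in gaps(SECTIONS):
        print(f"gap between {prev_name} and {next_name}: {gap} bytes")
```

The first gap is the ~28MB hole between .bolt.org.text and .text; the second one (between .text and .text.cold) is only alignment padding.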
Nothing clever needed, we could just have multiple sframe sections, each one with a pointer to its text segment. That would also have the benefit of allowing the sframe data to be much more compact for the noncontiguous cases.
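As a rough illustration of the multiple-sections idea, the lookup could be sketched like this. This is a user-space Python model under stated assumptions, not the kernel code from the patch: `SframeSection`, `func_starts`, and `find_func_start` are hypothetical names, and a real sframe section holds function descriptor entries rather than a bare list of addresses.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class SframeSection:
    """Hypothetical model: one sframe section covering one text range."""
    text_start: int                 # first address covered
    text_end: int                   # one past the last address covered
    func_starts: list = field(default_factory=list)  # sorted start addresses


def find_func_start(sections, ip):
    """Return the start address of the function containing ip, or None.

    Assumes `sections` is sorted by text_start and the ranges do not
    overlap. An ip that falls in a gap between ranges yields None.
    """
    starts = [s.text_start for s in sections]
    # Last section whose text_start is <= ip.
    i = bisect.bisect_right(starts, ip) - 1
    if i < 0:
        return None
    sec = sections[i]
    if ip >= sec.text_end:
        return None  # ip lies in a hole between text ranges
    # Last function in that section whose start is <= ip.
    j = bisect.bisect_right(sec.func_starts, ip) - 1
    if j < 0:
        return None
    return sec.func_starts[j]
```

Since each section only describes its own text range, the gaps between ranges cost nothing, and offsets stored relative to `text_start` would stay small, which is where the compactness would come from.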
I think having all this just binary searchable is already a big win anyways and should be plenty fast, no?
Sframe is trying to compete with frame pointers which are MUCH faster. 3-4x faster in my testing, not including the page faults (which tend to only affect performance in the very beginning).

-- 
Josh