Re: [PATCH v4 17/39] unwind_user/sframe: Add support for reading .sframe headers
From: Indu Bhagat <hidden>
Date: 2025-02-06 01:11:18
Also in: linux-perf-users, linux-toolchains, lkml
On 2/4/25 4:57 PM, Josh Poimboeuf wrote:
> On Thu, Jan 30, 2025 at 01:39:52PM -0800, Indu Bhagat wrote:
> > On 1/28/25 6:02 PM, Josh Poimboeuf wrote:
> > > However, if we're going that route, we might want to even consider a
> > > completely revamped data layout. For example:
> > >
> > > One insight is that the vast majority of (cfa, fp, ra) tuples aren't
> > > unique. They could be deduped by storing the unique tuples in a
> > > standalone 'fre_data' array which is referenced by another
> > > address-specific array.
> > >
> > >     struct fre_data {
> > >         s8|s16|s32 cfa, fp, ra;
> > >         u8 info;
> > >     };
> > >     struct fre_data fre_data[num_fre_data];
> >
> > We had the same observation at the time of SFrame V1. And this method of
> > compaction (deduped tuples) was brain-stormed a bit. Back then, the costs
> > were thought to be:
> >   - more work at build time.
> >   - an additional data access once the FRE is found (as there is
> >     indirection).
> >
> > So it was really compaction at the costs above. We did steer towards
> > simplicity and the SFrame FRE is what it stands today.
> >
> > The difference in the pros and cons now from then:
> >   - pros: helps mitigate unaligned accesses
> >   - cons: interferes slightly with the design goal of efficient addition
> >     and removal of stack trace information per function for JIT. Think
> >     "removal" as the set of actions necessary for addressing
> >     fragmentation in SFrame section data in JIT usecase.
>
> If fre_data[] is allowed to have duplicates then the deduping could be
> optional.
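To make the "optional deduping" point concrete, here is a minimal sketch of how a producer might fill such a table. It is not taken from the patch series: the fixed 32-bit offsets (the proposal allows s8|s16|s32), the helper names fre_data_equal() and fre_data_intern(), and the linear scan are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical fixed-width variant of the fre_data tuple quoted above.
 * The proposal allows s8|s16|s32 offsets; 32-bit is assumed here.
 */
struct fre_data {
	int32_t cfa, fp, ra;
	uint8_t info;
};

/* Compare field by field so that struct padding never matters. */
static int fre_data_equal(const struct fre_data *a, const struct fre_data *b)
{
	return a->cfa == b->cfa && a->fp == b->fp &&
	       a->ra == b->ra && a->info == b->info;
}

/*
 * Return the index of tuple 't' in table[0..*num), appending it if needed.
 * With dedupe == 0 the tuple is always appended, so the table may contain
 * duplicates and the producer skips the search entirely.
 */
static uint32_t fre_data_intern(struct fre_data *table, uint32_t *num,
				uint32_t cap, const struct fre_data *t,
				int dedupe)
{
	if (dedupe) {
		/* Linear scan for brevity; a real producer would likely hash. */
		for (uint32_t i = 0; i < *num; i++)
			if (fre_data_equal(&table[i], t))
				return i;
	}
	assert(*num < cap);	/* caller must size the table */
	table[*num] = *t;
	return (*num)++;
}
```

A consumer only ever follows the returned index, so it cannot tell whether the producer deduped or not. That is what would let a JIT skip the search and simply append.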
> > > Note FDEs aren't even needed here as the unwinder doesn't need to know
> > > when a function begins/ends. The only info needed by the unwinder is
> > > just the fre_data struct. So a simple binary search of fres[] is all
> > > that's really needed.
> >
> > Splitting out information (start_address) to an FDE (as done in V1/V2)
> > has the benefit that a job like relocating information is proportional
> > to O(NumFunctions). In the case above, IIUC, where the proposal puts
> > start_address in the FRE, these costs will be (much) higher.
>
> I'm not sure I follow, is this referring to the link-time work of sorting
> things?
I meant the work of tracking the start address of each function. This
could be done at link time, as it is in most cases. But it depends on the
case: e.g., the kernel module loader will need to apply these relocations
in the .rela.sframe section... If the granularity is finer than a
function, more relocations will need to be applied.
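For reference, the "simple binary search of fres[]" from the quoted proposal could look roughly like the sketch below. The layout of struct fre (a start offset plus an index into fre_data[]) and the name fre_lookup() are assumptions; the thread does not spell them out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address-specific entry; the field names are assumptions. */
struct fre {
	uint32_t start_off;	/* offset of the first instruction covered */
	uint32_t data_idx;	/* index into the fre_data[] table */
};

/*
 * Return the last entry whose start_off is <= ip_off, or NULL if ip_off
 * lies before the first entry. fres[] must be sorted by start_off.
 */
static const struct fre *fre_lookup(const struct fre *fres, size_t num,
				    uint32_t ip_off)
{
	size_t lo = 0, hi = num;	/* search window is [lo, hi) */

	assert(fres || num == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (fres[mid].start_off <= ip_off)
			lo = mid + 1;	/* mid covers ip_off or lies before it */
		else
			hi = mid;	/* mid starts after ip_off */
	}
	/* lo is now the number of entries with start_off <= ip_off. */
	return lo ? &fres[lo - 1] : NULL;
}
```

In a layout like this every fres[] entry carries its own address, so any relocation of start addresses is per entry rather than per function. With no function end recorded, an address past the last function also resolves to the final entry in this sketch.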
> > In addition, not being able to identify stack trace information per
> > function will affect the JIT usecase. We need to be able to mark stack
> > trace information stale for functions in JIT environment.
>
> Maybe, though it's hard to really say how any of these changes would
> affect JIT without knowing what those interfaces are going to look like.