Re: [PATCH v4 1/4] x86/ibt: factor out cfi and fineibt offset
From: Alexei Starovoitov <hidden>
Date: 2025-03-06 03:39:33
Also in:
bpf, linux-arm-kernel, lkml, llvm, netdev
On Wed, Mar 5, 2025 at 6:59 PM Menglong Dong [off-list ref] wrote:
> I'm not sure if it works. However, indirect call is also used in function graph, so we still have better performance. Isn't it? Let me have a look at the code of the function graph first :/
Menglong,

Function graph infra isn't going to help. "call foo" isn't a problem either. But we have to step back.

Per-function metadata is an optimization, and it feels like we're optimizing prematurely here, without collecting performance numbers first.

Let's implement multi-fentry with a generic get_metadata_by_ip() first. get_metadata_by_ip() will be a hashtable in that case, and then we can compare its performance when it's implemented as a direct lookup from ip-4 (this patch) vs. a hash table (that does 'ip' to 'metadata' lookup).

If/when we decide to do this per-function metadata, we can also punt to the generic hashtable for CFI, IBT, FineIBT, etc. configs. When mitigations are enabled the performance suffers anyway, so hashtable lookup vs. direct ip-4 lookup won't make much difference. So we can enable per-function metadata only on non-mitigation configs when FUNCTION_ALIGNMENT=16. There will be some number of bytes available before every function, and if we can tell gcc/llvm to leave at least 5 bytes there, the growth of vmlinux .text will be within noise.

So let's figure out the design of multi-fentry first, with a hashtable for metadata, and decide next steps afterwards.
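To make the comparison concrete, here is a minimal user-space sketch of the two get_metadata_by_ip() variants discussed above. It is not kernel code and not what the patch does verbatim. Everything except the name get_metadata_by_ip() is an assumption made for illustration: struct func_md, md_array, the fake_text buffer standing in for .text, the multiplicative hash, and the idea that the 4 bytes before a function entry hold an index into a metadata array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-function metadata; the real layout is not given here. */
struct func_md {
	uint32_t id;
};

#define NR_FUNCS	3
#define FUNC_ALIGN	16	/* mirrors FUNCTION_ALIGNMENT=16 */
#define MD_HASH_BITS	4
#define MD_HASH_SIZE	(1u << MD_HASH_BITS)

static struct func_md md_array[NR_FUNCS];

/* Stand-in for .text: "functions" are laid out at a FUNC_ALIGN-byte stride. */
static unsigned char fake_text[FUNC_ALIGN * (NR_FUNCS + 1)];

static unsigned char *func_entry(unsigned int i)
{
	return fake_text + FUNC_ALIGN * (i + 1);
}

/* Variant A: generic hash table mapping 'ip' to 'metadata'. */
struct md_node {
	uintptr_t ip;
	struct func_md *md;
	struct md_node *next;
};

static struct md_node md_nodes[NR_FUNCS];
static struct md_node *md_hash[MD_HASH_SIZE];

static unsigned int md_hash_ip(uintptr_t ip)
{
	/* Drop the low 4 bits (entries are 16 bytes apart), then mix. */
	uint32_t h = (uint32_t)(ip >> 4) * 2654435761u;

	return h >> (32 - MD_HASH_BITS);
}

static struct func_md *get_metadata_by_ip_hash(const unsigned char *ip)
{
	struct md_node *n;

	for (n = md_hash[md_hash_ip((uintptr_t)ip)]; n; n = n->next)
		if (n->ip == (uintptr_t)ip)
			return n->md;
	return NULL;
}

/* Variant B: direct lookup, reading an assumed 32-bit index stored at ip-4. */
static struct func_md *get_metadata_by_ip_direct(const unsigned char *ip)
{
	uint32_t idx;

	memcpy(&idx, ip - 4, sizeof(idx));
	return idx < NR_FUNCS ? &md_array[idx] : NULL;
}

/* Registers every fake function in both schemes; returns the mismatch count. */
int md_demo(void)
{
	int mismatches = 0;
	uint32_t i;

	memset(md_hash, 0, sizeof(md_hash));
	for (i = 0; i < NR_FUNCS; i++) {
		unsigned char *ip = func_entry(i);
		unsigned int b = md_hash_ip((uintptr_t)ip);

		md_array[i].id = 100 + i;

		md_nodes[i].ip = (uintptr_t)ip;
		md_nodes[i].md = &md_array[i];
		md_nodes[i].next = md_hash[b];
		md_hash[b] = &md_nodes[i];

		memcpy(ip - 4, &i, sizeof(i));	/* index in the pre-function bytes */
	}

	for (i = 0; i < NR_FUNCS; i++) {
		const unsigned char *ip = func_entry(i);

		if (get_metadata_by_ip_hash(ip) != get_metadata_by_ip_direct(ip))
			mismatches++;
	}
	/* An address that was never registered must miss in the hash table. */
	if (get_metadata_by_ip_hash(fake_text) != NULL)
		mismatches++;
	return mismatches;
}
```

Both variants sit behind the same lookup shape, so a caller such as a multi-fentry trampoline would not need to know which one is compiled in. That is what would allow measuring the hash table first and switching to the ip-4 form only on configs where the padding bytes are actually free.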