Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
From: Yonghong Song <hidden>
Date: 2021-09-07 22:24:53
On 9/7/21 12:36 PM, Toke Høiland-Jørgensen wrote:
Yonghong Song [off-list ref] writes:
On 9/2/21 3:08 PM, Toke Høiland-Jørgensen wrote:
Yonghong Song [off-list ref] writes:
On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen [off-list ref] wrote:
Yonghong Song [off-list ref] writes:
On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
Andrii Nakryiko [off-list ref] writes:
On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen [off-list ref] wrote:

When .eh_frame and .rel.eh_frame sections are present in BPF object files, libbpf produces errors like this when loading the file:

  libbpf: elf: skipping unrecognized data section(32) .eh_frame
  libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame

It is possible to get rid of the .eh_frame section by adding -fno-asynchronous-unwind-tables to the compilation, but we have seen multiple examples of these sections appearing in BPF files in the wild, most recently in samples/bpf, fixed by:

  5a0ae9872d5c ("bpf, samples: Add -fno-/to BPF Clang invocation")

While the errors are technically harmless, they look odd and confuse users.

These warnings point out an invalid set of compiler flags used for compiling BPF object files, though. Which is a good thing and should incentivize anyone getting those warnings to check and fix how they do BPF compilation. Those .eh_frame sections shouldn't be present in BPF object files at all, and that's what libbpf is trying to say.

Apart from triggering that warning, what effect does this have, though? The programs seem to work just fine (as evidenced by the fact that samples/bpf has been built this way for years, for instance)... Also, how is a user supposed to go from that cryptic error message to figuring out that it has something to do with compiler flags?

I don't know exactly in which situations that .eh_frame section is added, but looking at our selftests (and now samples/bpf as well), where we use -target bpf, we don't need -fno-asynchronous-unwind-tables at all.

This seems to at least be compiler-dependent. We ran into this with bpftool as well (for the internal BPF programs it loads whenever it runs), which already had '-target bpf' in the Makefile. We're carrying an internal RHEL patch adding -fno-asynchronous-unwind-tables to the bpftool build to fix this...

I haven't seen an instance of .eh_frame with -target bpf either. Do you have a reproducible test case? I would like to investigate what the possible cause is and whether we could do something in llvm to prevent its generation. Thanks!

We found this in the RHEL builds of bpftool. I don't think we're doing anything special, other than maybe building with a clang version that's a few versions behind:

  # clang --version
  clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
  Target: x86_64-unknown-linux-gnu
  Thread model: posix
  InstalledDir: /usr/bin

So I suppose it may resolve itself once we upgrade LLVM?

That's odd. I don't think I've seen this issue even with clang 11 (but I built it myself).

I cannot reproduce it myself with self-built llvm (11, 12, 13, 14). But I can reproduce it with an upstream-built llvm12.

  /bin/clang \
    -I. \
    -I/home/yhs/work/bpf-next/tools/include/uapi/ \
    -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
    -I/home/yhs/work/bpf-next/tools/lib \
    -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
  GEN     pid_iter.skel.h
  libbpf: elf: skipping unrecognized data section(11) .eh_frame
  libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame

Ah, that's interesting!

If there is a fix indeed let's backport it to llvm 11. The user experience matters. It could be llvm configuration too. I'm guessing some build flags might influence default settings for unwind tables. Yonghong, can we make bpf backend to ignore needsUnwindTableEntry?

Sure. I will try to get upstream build flags, reproduce and fix it in llvm.

I did some investigation and this is due to centos private patch:
  https://git.centos.org/rpms/clang/blob/b99d8d4a38320329e10570f308c3e2d8cf295c78/f/SOURCES/0002-PATCH-clang-Make-funwind-tables-the-default-on-all-a.patch

In upstream, the original llvm-project source is patched with several private patches before building the rpm.
  https://koji.mbox.centos.org/pkgs/packages/clang/12.0.1/1.module_el8.5.0+892+54d791e1/data/logs/x86_64/build.log

The above private patch enables unwind-table (.eh_frame section) by default for ALL architectures and bpf is a victim of this.

Ah, doh! I had no idea we were doing this :/

I filed a redhat bugzilla bug to fix their private patch.
  https://bugzilla.redhat.com/show_bug.cgi?id=2002024
Hopefully future compiler builds won't have this issue.

Thank you for finding the root cause of this! I'll follow up internally and make sure we get this fixed...

Thanks! Hopefully this can be resolved soon.
-Toke
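
For anyone who wants to check whether their own toolchain emits these sections, here is a minimal sketch in Python that lists the section names of an object file and reports the two sections discussed in this thread. It is not how libbpf parses ELF files. The function names (section_names, unwind_sections) are made up for illustration, and the parser assumes a little-endian ELF64 object with a regular section header table, which is the usual output of -target bpf on x86_64.

```python
import struct

# ELF64 file header and section header layouts (little-endian).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
SHDR = struct.Struct("<IIQQQQIIQQ")


def section_names(data: bytes) -> list:
    """Return the section names of a little-endian ELF64 object, in order."""
    if (len(data) < EHDR.size or data[:4] != b"\x7fELF"
            or data[4] != 2 or data[5] != 1):
        raise ValueError("expected a little-endian ELF64 object")
    fields = EHDR.unpack_from(data, 0)
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    headers = [SHDR.unpack_from(data, shoff + i * shentsize) for i in range(shnum)]
    # The section name string table is itself a section (index e_shstrndx).
    str_off, str_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = data[str_off:str_off + str_size]
    names = []
    for hdr in headers:
        start = hdr[0]                      # sh_name: offset into strtab
        end = strtab.index(b"\0", start)    # names are NUL-terminated
        names.append(strtab[start:end].decode())
    return names


def unwind_sections(names) -> list:
    """Pick out the unwind-table sections mentioned in the libbpf warnings."""
    return [n for n in names if n in (".eh_frame", ".rel.eh_frame")]
```

Usage would be something like unwind_sections(section_names(open("pid_iter.bpf.o", "rb").read())); a non-empty result suggests the compiler is emitting unwind tables for BPF and that -fno-asynchronous-unwind-tables (or a fixed compiler package) is needed.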