Re: [PATCH bpf-next 0/2] libbpf: Implement BTF Generator API

From: Rafael David Tinoco <hidden>
Date: 2021-10-29 05:42:11
Also in: netdev

On Thu, Oct 28, 2021 at 11:34 PM Alexei Starovoitov
[off-list ref] wrote:

On Wed, Oct 27, 2021 at 1:37 PM Mauricio Vásquez [off-list ref] wrote:

quoted

There is also a good example[3] on how to use BTFGen and BTFHub together
to generate multiple BTF files, to each existing/supported kernel,
tailored to one application. For example: a complex bpf object might
support nearly 400 kernels by having BTF files summing only 1.5 MB.

Could you share more details on what kind of fields and types
were used to achieve this compression?
Tracing progs will be peeking into task_struct.
To describe it in the reduced BTF most of the kernel types would be needed,
so I'm a bit skeptical on the practicality of the algorithm.

https://github.com/aquasecurity/btfhub/tree/main/tools

has a complete README and, at the end, the example used:

https://github.com/aquasecurity/btfhub/tree/main/tools#time-to-test-btfgen-and-btfhub

We tested btfgen with bpfcc tools and tracee:

https://github.com/aquasecurity/tracee/blob/main/tracee-ebpf/tracee/tracee.bpf.c

and the generated BTF files worked. If you run something like:

./btfgen.sh [.../aquasec-tracee/tracee-ebpf/dist/tracee.bpf.core.o]

it will generate the BTFs tailored to a given eBPF object file, 1 smaller BTF
file per existing full external raw BTF file (1 per kernel version, basically).

All the ~500 kernels generated the same amount of BTF files with ~3MB in
total. We then remove all the BTF files that are equal to their previous
kernels:

https://github.com/aquasecurity/btfhub/blob/main/tools/btfgen.sh#L113

and we are able to reduce from 3MB to 1.5MB (as similar BTF files are symlinks
to the previous ones).

I think it may work for sk_buff, since it will pull struct sock,
net_device, rb_tree, and not a ton more.
Have you considered generating kernel BTF with fields that are accessed
by bpf prog only and replacing all other fields with padding ?

That is exactly the result of our final BTF file. We only include the
fields and types being used by the given eBPF object:

$ bpftool btf dump file ./generated/5.4.0-87-generic.btf format raw
[1] PTR '(anon)' type_id=99
[2] TYPEDEF 'u32' type_id=35
[3] TYPEDEF '__be16' type_id=22
[4] PTR '(anon)' type_id=52
[5] TYPEDEF '__u8' type_id=83
[6] PTR '(anon)' type_id=29
[7] STRUCT 'mnt_namespace' size=120 vlen=1
    'ns' type_id=72 bits_offset=64
[8] TYPEDEF '__kernel_gid32_t' type_id=75
[9] STRUCT 'iovec' size=16 vlen=2
    'iov_base' type_id=16 bits_offset=0
    'iov_len' type_id=85 bits_offset=64
[10] PTR '(anon)' type_id=58
[11] STRUCT '(anon)' size=8 vlen=2
    'skc_daddr' type_id=81 bits_offset=0
    'skc_rcv_saddr' type_id=81 bits_offset=32
[12] TYPEDEF '__u64' type_id=89
...
[120] STRUCT 'task_struct' size=9216 vlen=13
    'thread_info' type_id=105 bits_offset=0
    'real_parent' type_id=30 bits_offset=18048
    'real_cred' type_id=16 bits_offset=21248
    'pid' type_id=14 bits_offset=17920
    'mm' type_id=110 bits_offset=16512
    'thread_pid' type_id=56 bits_offset=18752
    'exit_code' type_id=123 bits_offset=17152
    'group_leader' type_id=30 bits_offset=18432
    'flags' type_id=75 bits_offset=288
    'thread_group' type_id=87 bits_offset=19328
    'tgid' type_id=14 bits_offset=17952
    'nsproxy' type_id=100 bits_offset=22080
    'comm' type_id=96 bits_offset=21440
[121] STRUCT 'pid' size=96 vlen=1
    'numbers' type_id=77 bits_offset=640
[122] STRUCT 'new_utsname' size=390 vlen=1
    'nodename' type_id=48 bits_offset=520
[123] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[124] ARRAY '(anon)' type_id=35 index_type_id=123 nr_elems=2

If you do a "format c" to the generated BTF file, then bpftool considers
everything as padding:

typedef unsigned char __u8;

struct ns_common {
        long: 64;
        long: 64;
        unsigned int inum;
        int: 32;
};

struct mnt_namespace {
        long: 64;
        struct ns_common ns;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
};

typedef unsigned int __kernel_gid32_t;

But libbpf is still able to calculate all field relocations.

I think the algo would be quite different from the actual CO-RE logic
you're trying to reuse.
If CO-RE matching style is necessary and it's the best approach then please
add new logic to bpftool.

Yes, we're heading that direction I suppose :\ ... Accessing .BTF.ext, to get
the "bpf_core_relo" information requires libbpf internals to be exposed:

https://github.com/rafaeldtinoco/btfgen/blob/standalone/btfgen2.c#L119
https://github.com/rafaeldtinoco/btfgen/blob/standalone/include/stolen.h

we would have to check if we can try to export what we need for that (instead
of re-declaring internal headers, which obviously looks bad).

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help