Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
From: Song Liu <hidden>
Date: 2020-11-03 05:59:35
Also in:
bpf
On Nov 2, 2020, at 9:25 PM, Andrii Nakryiko [off-list ref] wrote: On Mon, Nov 2, 2020 at 6:49 PM Song Liu [off-list ref] wrote:quoted
quoted
On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko [off-list ref] wrote: Add support for deduplication split BTFs. When deduplicating split BTF, base BTF is considered to be immutable and can't be modified or adjusted. 99% of BTF deduplication logic is left intact (module some type numbering adjustments). There are only two differences. First, each type in base BTF gets hashed (expect VAR and DATASEC, of course, those are always considered to be self-canonical instances) and added into a table of canonical table candidates. Hashing is a shallow, fast operation, so mostly eliminates the overhead of having entire base BTF to be a part of BTF dedup. Second difference is very critical and subtle. While deduplicating split BTF types, it is possible to discover that one of immutable base BTF BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type from the split BTF part. This is, obviously, can't happen because we can't modify the base BTF types anymore. So because of that, any type in split BTF that directly or indirectly references that newly-to-be-resolved FWD type can't be considered to be equivalent to the corresponding canonical types in base BTF, because that would result in a loss of type resolution information. So in such case, split BTF types will be deduplicated separately and will cause some duplication of type information, which is unavoidable. With those two changes, the rest of the algorithm manages to deduplicate split BTF correctly, pointing all the duplicates to their canonical counter-parts in base BTF, but also is deduplicating whatever unique types are present in split BTF on their own. Also, theoretically, split BTF after deduplication could end up with either empty type section or empty string section. This is handled by libbpf correctly in one of previous patches in the series. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>Acked-by: Song Liu <redacted> With some nits:quoted
---[...]quoted
/* remap string offsets */ err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);@@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2) return true;}An overview comment about bpf_deup_prep() will be great.okquoted
quoted
+static int btf_dedup_prep(struct btf_dedup *d) +{ + struct btf_type *t; + int type_id; + long h; + + if (!d->btf->base_btf) + return 0; + + for (type_id = 1; type_id < d->btf->start_id; type_id++) + {Move "{" to previous line?yep, my badquoted
quoted
+ t = btf_type_by_id(d->btf, type_id); + + /* all base BTF types are self-canonical by definition */ + d->map[type_id] = type_id; + + switch (btf_kind(t)) { + case BTF_KIND_VAR: + case BTF_KIND_DATASEC: + /* VAR and DATASEC are never hash/deduplicated */ + continue;[...]quoted
/* we are going to reuse hypot_map to store compaction remapping */ d->hypot_map[0] = 0; - for (i = 1; i <= d->btf->nr_types; i++) - d->hypot_map[i] = BTF_UNPROCESSED_ID; + /* base BTF types are not renumbered */ + for (id = 1; id < d->btf->start_id; id++) + d->hypot_map[id] = id; + for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) + d->hypot_map[id] = BTF_UNPROCESSED_ID;We don't really need i in the loop, shall we just do for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++) ?I prefer the loop with i iterating over the count of types, it seems more "obviously correct". For simple loop like this I could do for (i = 0; i < d->btf->nr_types; i++) d->hypot_map[d->start_id + i] = ...; But for the more complicated one below I found that maintaining id as part of the for loop control block is a bit cleaner. So I just stuck to the consistent pattern across all of them.
How about
for (i = 0; i < d->btf->nr_types; i++) {
id = d->start_id + i;
...
?
I would expect for loop with two loop variable to do some tricks, like two
termination conditions, or another conditional id++ somewhere in the loop.
Thanks,
Song