Re: [PATCH bpf-next v4 08/11] libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
From: Alexei Starovoitov <hidden>
Date: 2021-09-21 19:04:54
Also in:
bpf
On Mon, Sep 20, 2021 at 9:50 PM Kumar Kartikeya Dwivedi [off-list ref] wrote:
> On Tue, Sep 21, 2021 at 06:27:16AM IST, Alexei Starovoitov wrote:
> > On Mon, Sep 20, 2021 at 7:15 AM Kumar Kartikeya Dwivedi [off-list ref] wrote:
> > > This change updates the BPF syscall loader to relocate BTF_KIND_FUNC relocations, with support for weak kfunc relocations. The next commit adds bpftool support to set up the fd_array_sz parameter for light skeleton. A second map for keeping fds is used instead of adding fds to existing loader.map because of the following reasons:
> >
> > but it complicates signing bpf progs a lot.
>
> Can you explain this in short? (Just want to understand why it would be a problem).

The signing idea (and light skeleton too) relies on two matching blocks: a signed map and a signed prog that operates on this map. They have to match and are technically part of a single logical signature that consists of two pieces. The second map doesn't quite fit this model, especially since it's an empty map that exists only for temporary use during execution of the loader prog. The fd_array_sz value would somehow need to be part of the signature. Adding a third, non-generic component to a signature has consequences for the whole signing process. The loader prog could have created this temp map on its own, without asking bpf_load_and_run() to do it and without exposing it in the signature.

Anyway, signed bpf progs may get solved differently with John's latest proposal, but that's a different discussion. The minimalism of the light skeleton is its main advantage. Keeping it to two pieces, one map and one prog, is its main selling point.

> > > If reserving an area for map and BTF fds, we would waste the remainder of (MAX_USED_MAPS + MAX_KFUNC_DESCS) * sizeof(int), which in most cases will be unused by the program. Also, we must place some limit on the number of map and BTF fds a program can possibly open.
> >
> > That is just (256 + 64)*4 bytes of data. Really not much. I wouldn't worry about reserving this space.
>
> Ok, I'll probably go with this now. I didn't realise a separate fd would be prohibitive for the signing case, so I thought it would be nice to lift the limitation on the number of map_fds by packing fd_array fds in another map.

> > > If setting gen->fd_array to the first map_fd offset, and then just finding the offset relative to this (for later BTF fds), such that they can be packed without wasting space, we run the risk of unnecessarily running out of valid offsets for the emit_relo stage (for kfuncs), because the gen map creation and relocation stages are separated by other steps that can add lots of data (including bpf_object__populate_internal_map). It is also prone to break silently if features are added between map and BTF fd emits that possibly add more data (just ~128KB to break BTF fd, since insn->off allows for INT16_MAX (32767) * 4 bytes).
> >
> > I don't follow this logic.
> >
> > > Both of these issues are compounded by the fact that the data map is shared by all programs, so it is easy to end up with an invalid offset for a BTF fd.
> >
> > I don't follow this either. There is only one map and one program. What sharing are you talking about?
>
> What I saw was that the sequence of calls is like this:
>
> bpf_gen__map_create
>   add_data - from the first emit we add map_fd, we also store gen->fd_array
> then libbpf would call bpf_object__populate_internal_map
>   which calls bpf_gen__map_update_elem, which also does add_data (can be of arbitrary sizes).
> emit_relos happens relatively at the end.
>
> For each program in the object, this sequence can be repeated, such that for the add_data that we do in emit_relos, the offset relative to gen->fd_array can end up becoming big enough (as all programs in the object add data to the same map), while gen->fd_array comes from the first map creation.

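The overflow risk described in the quoted text can be illustrated with a small sketch. It assumes, as the quote states, that fd_array slots are 4-byte ints and that the slot index has to fit in the signed 16-bit bpf_insn->off field. The helper names are hypothetical and are not libbpf functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libbpf: turn a byte offset inside the
 * loader data blob into an fd_array slot index relative to fd_array_off.
 * Assumes both offsets are int-aligned and blob_off >= fd_array_off. */
static size_t fd_slot_index(size_t fd_array_off, size_t blob_off)
{
	assert(blob_off >= fd_array_off);
	return (blob_off - fd_array_off) / sizeof(int);
}

/* The slot index must be representable in bpf_insn->off (signed 16-bit). */
static bool fd_slot_fits_insn_off(size_t slot)
{
	return slot <= (size_t)INT16_MAX;
}
```

With 4-byte ints, a BTF fd stored 131068 bytes past the first map_fd still maps to slot 32767, while one stored 131072 bytes past it maps to slot 32768 and no longer fits. That is the "~128KB" limit mentioned in the quote.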
You meant to use fd_array as a very, very sparse array with giant gaps between valid map_fds and btf_fds. Now I see it :) Indeed, in such a case there is a risk of running out of the 16 bits in bpf_insn->off. Reserving (256 + 64)*4 bytes at the beginning of the data map should solve it, right? The loader prog could create a second auxiliary map on the fly, but it seems easier and simpler to just reserve this space in the one and only map.
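
A minimal sketch of the reservation idea, under stated assumptions: the slot counts are taken from the "(256 + 64)*4" figure in this thread (which constant is 256 and which is 64 is an assumption here), and every SKETCH_/sketch_ name is hypothetical rather than actual gen_loader code. The point it tries to show is that once the fd slots sit at the start of the data map, their indices stay small no matter how much data is appended later.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the thread only gives the total (256 + 64) slots. */
#define SKETCH_MAX_USED_MAPS   64
#define SKETCH_MAX_KFUNC_DESCS 256
#define SKETCH_FD_SLOTS   (SKETCH_MAX_USED_MAPS + SKETCH_MAX_KFUNC_DESCS)
#define SKETCH_FD_AREA_SZ (SKETCH_FD_SLOTS * sizeof(int))

/* Hypothetical model of the loader data blob. */
struct sketch_blob {
	size_t data_sz; /* bytes used so far, including the reserved fd area */
	int nr_maps;    /* map fd slots handed out */
	int nr_btfs;    /* BTF fd slots handed out */
};

/* Reserve the fd area at offset 0 before any other data is added. */
static void sketch_blob_init(struct sketch_blob *b)
{
	b->data_sz = SKETCH_FD_AREA_SZ;
	b->nr_maps = 0;
	b->nr_btfs = 0;
}

/* Append sz bytes of arbitrary data and return their byte offset.
 * The fd slots are not affected by how much is appended here. */
static size_t sketch_add_data(struct sketch_blob *b, size_t sz)
{
	size_t off;

	assert(b->data_sz >= SKETCH_FD_AREA_SZ); /* init must have run */
	off = b->data_sz;
	b->data_sz += sz;
	return off;
}

/* Return the slot index for the next map fd, or -1 if the area is full. */
static int sketch_add_map_fd(struct sketch_blob *b)
{
	if (b->nr_maps >= SKETCH_MAX_USED_MAPS)
		return -1;
	return b->nr_maps++;
}

/* BTF fd slots follow the map fd slots; returns -1 if the area is full. */
static int sketch_add_kfunc_btf_fd(struct sketch_blob *b)
{
	if (b->nr_btfs >= SKETCH_MAX_KFUNC_DESCS)
		return -1;
	return SKETCH_MAX_USED_MAPS + b->nr_btfs++;
}
```

In this model the largest possible slot index is 319, far below the 32767 limit of bpf_insn->off, at the cost of a fixed reservation of 320 * sizeof(int) bytes.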