Re: [PATCH 17/39] bpf: resolve_pseudo_ldimm64(): take handling of a single ldimm64 insn into helper
From: Alexei Starovoitov <hidden>
Date: 2024-08-09 01:23:15
Also in:
bpf, cgroups, kvm, linux-fsdevel
On Thu, Aug 8, 2024 at 1:35 PM Andrii Nakryiko [off-list ref] wrote:
On Thu, Aug 8, 2024 at 9:51 AM Alexei Starovoitov [off-list ref] wrote:quoted
On Wed, Aug 7, 2024 at 8:31 AM Andrii Nakryiko [off-list ref] wrote:quoted
On Wed, Aug 7, 2024 at 3:30 AM Christian Brauner [off-list ref] wrote:quoted
On Tue, Aug 06, 2024 at 03:32:20PM GMT, Andrii Nakryiko wrote:quoted
On Mon, Jul 29, 2024 at 10:20 PM [off-list ref] wrote:quoted
From: Al Viro <viro@zeniv.linux.org.uk> Equivalent transformation. For one thing, it's easier to follow that way. For another, that simplifies the control flow in the vicinity of struct fd handling in there, which will allow a switch to CLASS(fd) and make the thing much easier to verify wrt leaks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- kernel/bpf/verifier.c | 342 +++++++++++++++++++++--------------------- 1 file changed, 172 insertions(+), 170 deletions(-)This looks unnecessarily intrusive. I think it's best to extract the logic of fetching and adding bpf_map by fd into a helper and that way contain fdget + fdput logic nicely. Something like below, which I can send to bpf-next. commit b5eec08241cc0263e560551de91eda73ccc5987d Author: Andrii Nakryiko [off-list ref] Date: Tue Aug 6 14:31:34 2024 -0700 bpf: factor out fetching bpf_map from FD and adding it to used_maps list Factor out the logic to extract bpf_map instances from FD embedded in bpf_insns, adding it to the list of used_maps (unless it's already there, in which case we just reuse map's index). This simplifies the logic in resolve_pseudo_ldimm64(), especially around `struct fd` handling, as all that is now neatly contained in the helper and doesn't leak into a dozen error handling paths. Signed-off-by: Andrii Nakryiko [off-list ref]diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index df3be12096cf..14e4ef687a59 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c@@ -18865,6 +18865,58 @@ static bool bpf_map_is_cgroup_storage(structbpf_map *map) map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE); } +/* Add map behind fd to used maps list, if it's not already there, and return + * its index. Also set *reused to true if this map was already in the list of + * used maps. + * Returns <0 on error, or >= 0 index, on success. + */ +static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd, bool *reused) +{ + struct fd f = fdget(fd);Use CLASS(fd, f)(fd) and you can avoid all that fdput() stuff.That was the point of Al's next patch in the series, so I didn't want to do it in this one that just refactored the logic of adding maps. But I can fold that in and send it to bpf-next.+1. The bpf changes look ok and Andrii's approach is easier to grasp. It's better to route bpf conversion to CLASS(fd,..) via bpf-next, so it goes through bpf CI and our other testing. bpf patches don't seem to depend on newly added CLASS(fd_pos, ... and fderr, so pretty much independent from other patches.Ok, so CLASS(fd, f) won't work just yet because of peculiar __bpf_map_get() contract: if it gets valid struct fd but it doesn't contain a valid struct bpf_map, then __bpf_map_get() does fdput() internally. In all other cases the caller has to do fdput() and returned struct bpf_map's refcount has to be bumped by the caller (__bpf_map_get() doesn't do that, I guess that's why it's double-underscored). I think the reason it was done was just a convenience to not have to get/put bpf_map for temporary uses (and instead rely on file's reference keeping bpf_map alive), plus we have bpf_map_inc() and bpf_map_inc_uref() variants, so in some cases we need to bump just refcount, and in some both user and normal refcounts. So can't use CLASS(fd, ...) without some more clean up. Alexei, how about changing __bpf_map_get(struct fd f) to __bpf_map_get_from_fd(int ufd), doing fdget/fdput internally, and always returning bpf_map with (normal) refcount bumped (if successful, of course). We can then split bpf_map_inc_with_uref() into just bpf_map_inc() and bpf_map_inc_uref(), and callers will be able to do extra uref-only increment, if necessary. I can do that as a pre-patch, there are about 15 callers, so not too much work to clean this up. Let me know.
Yeah. Let's kill __bpf_map_get(struct fd ..) altogether. This logic was added in 2014. fdget() had to be first and fdput() last to make sure the map won't disappear while sys_bpf command is running. All of the places can use bpf_map_get(), bpf_map_put() pair and rely on map->refcnt, but... - it's atomic64_inc(&map->refcnt); The cost is probably in the noise compared to all the work that map sys_bpf commands do. - It also opens new fuzzing opportunity to do some map operation in one thread and close(map_fd) in the other, so map->usercnt can drop to zero and map_release_uref() cleanup can start while the other thread is still busy doing something like map_update_elem(). It can be mitigated by doing bpf_map_get_with_uref(), but two atomic64_inc() is kinda too much. So let's remove __bpf_map_get() and replace all users with bpf_map_get(), but we may need to revisit that later.