Re: [PATCH bpf-next 6/8] libbpf: wire up BPF token support at BPF object level

From: Andrii Nakryiko <hidden>
Date: 2023-12-12 00:05:39
Also in: bpf, linux-fsdevel, linux-security-module

On Mon, Dec 11, 2023 at 2:56 PM John Fastabend [off-list ref] wrote:

Andrii Nakryiko wrote:

quoted

Add BPF token support to BPF object-level functionality.

BPF token is supported by BPF object logic either as an explicitly
provided BPF token from outside (through BPF FS path or explicit BPF
token FD), or implicitly (unless prevented through
bpf_object_open_opts).

Implicit mode is assumed to be the most common one for user namespaced
unprivileged workloads. The assumption is that privileged container
manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
delegation options (delegate_{cmds,maps,progs,attachs} mount options).
BPF object during loading will attempt to create BPF token from
/sys/fs/bpf location, and pass it for all relevant operations
(currently, map creation, BTF load, and program load).

In this implicit mode, if BPF token creation fails due to whatever
reason (BPF FS is not mounted, or kernel doesn't support BPF token,
etc), this is not considered an error. BPF object loading sequence will
proceed with no BPF token.

In explicit BPF token mode, user provides explicitly either custom BPF
FS mount point path or creates BPF token on their own and just passes
token FD directly. In such case, BPF object will either dup() token FD
(to not require caller to hold onto it for entire duration of BPF object
lifetime) or will attempt to create BPF token from provided BPF FS
location. If BPF token creation fails, that is considered a critical
error and BPF object load fails with an error.

Libbpf provides a way to disable implicit BPF token creation, if it
causes any troubles (BPF token is designed to be completely optional and
shouldn't cause any problems even if provided, but in the world of BPF
LSM, custom security logic can be installed that might change outcome
depending on the presence of BPF token). To disable libbpf's default BPF
token creation behavior user should provide either invalid BPF token FD
(negative), or empty bpf_token_path option.

BPF token presence can influence libbpf's feature probing, so if BPF
object has associated BPF token, feature probing is instructed to use
BPF object-specific feature detection cache and token FD.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c             |   7 +-
 tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
 tools/lib/bpf/libbpf.h          |  28 +++++++-
 tools/lib/bpf/libbpf_internal.h |  17 ++++-
 4 files changed, 160 insertions(+), 12 deletions(-)

...

quoted

+static int bpf_object_prepare_token(struct bpf_object *obj)
+{
+     const char *bpffs_path;
+     int bpffs_fd = -1, token_fd, err;
+     bool mandatory;
+     enum libbpf_print_level level = LIBBPF_DEBUG;

redundant set on level?

yep, removed initialization

quoted

+
+     /* token is already set up */
+     if (obj->token_fd > 0)
+             return 0;
+     /* token is explicitly prevented */
+     if (obj->token_fd < 0) {
+             pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
+             /* reset to zero to avoid extra checks during map_create and prog_load steps */
+             obj->token_fd = 0;
+             return 0;
+     }
+
+     mandatory = obj->token_path != NULL;
+     level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
+
+     bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
+     bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
+     if (bpffs_fd < 0) {
+             err = -errno;
+             __pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
+                  obj->name, err, bpffs_path,
+                  mandatory ? "" : ", skipping optional step...");
+             return mandatory ? err : 0;
+     }
+
+     token_fd = bpf_token_create(bpffs_fd, 0);

Did this get tested on older kernels? In that case TOKEN_CREATE will
fail with -EINVAL.

yep, I did actually test, it will generate expected *debug*-level
"failed to create BPF token" message

quoted

+     close(bpffs_fd);
+     if (token_fd < 0) {
+             if (!mandatory && token_fd == -ENOENT) {
+                     pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
+                              obj->name, bpffs_path);
+                     return 0;
+             }

Isn't there a case here we should give a warning about? If BPF_TOKEN_CREATE
exists and !mandatory, but the default BPF FS failed for ENOMEM or EPERM reasons?
If the user really doesn't want tokens here they should maybe override with
a -1 token? My thought is if you have delegations set up then something on the
system is trying to configure this and an error might be ok? I'm asking just
because I paused on it for a bit, not sure either way at the moment. I might
imagine a lazy program not specifying the default bpffs, but also really
thinking it's going to get a valid token.

Interesting perspective! I actually came from the direction that a BPF
token is not really a common or expected thing, and so in the
majority of cases (at least for some time) we won't be expecting to
have BPF FS with delegation options. So emitting a warning that
"something something BPF token failed" would be disconcerting to most
users.

What's the worst that would happen if BPF token was expected but we
failed to instantiate it? You'll get a BPF object load failure with
-EPERM, so it will be a pretty clear signal that whatever delegation
was supposed to happen didn't happen.

Also, if a user wants a BPF token for sure, they can explicitly set
bpf_token_path = "/sys/fs/bpf" and then it becomes mandatory.

So tl;dr, my perspective is that most users won't know or care about
BPF tokens. If sysadmin set up BPF FS correctly, it should just work
without the BPF application being aware. But for those rare cases
where a BPF token is expected and necessary, explicit bpf_token_path
or bpf_token_fd is the way to fail early, if something is not set up
the way it is expected.
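
To make the optional-versus-mandatory behavior described above concrete,
here is a minimal self-contained model of it. This is only a sketch: the
names token_log_level, token_outcome and token_setup_outcome are made up
for illustration and are not libbpf API. It assumes, as in the quoted
patch, that a non-NULL token path is what makes the token mandatory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names that model the behavior discussed here; not libbpf API. */
enum token_log_level { TOKEN_LOG_DEBUG, TOKEN_LOG_WARN };

struct token_outcome {
	int ret;                    /* what the token setup step would return */
	enum token_log_level level; /* severity of the failure message */
};

/*
 * token_path: NULL selects implicit (optional) mode, non-NULL makes the
 *             token mandatory.
 * create_err: 0 on success, negative errno on failure.
 */
struct token_outcome token_setup_outcome(const char *token_path, int create_err)
{
	bool mandatory = token_path != NULL;
	struct token_outcome out;

	out.level = mandatory ? TOKEN_LOG_WARN : TOKEN_LOG_DEBUG;
	out.ret = (mandatory && create_err < 0) ? create_err : 0;
	return out;
}

/* Usage: the same -EPERM is fatal only when a path was given explicitly. */
void token_setup_outcome_demo(void)
{
	assert(token_setup_outcome(NULL, -EPERM).ret == 0);
	assert(token_setup_outcome("/sys/fs/bpf", -EPERM).ret == -EPERM);
}
```

In this model an implicit-mode failure is swallowed and logged at debug
level, so a missing token only surfaces later as a load failure, while an
explicit path turns the same error into an early, visible one.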

quoted

+             __pr(level, "object '%s': failed (%d) to create BPF token from '%s'%s\n",
+                  obj->name, token_fd, bpffs_path,
+                  mandatory ? "" : ", skipping optional step...");
+             return mandatory ? token_fd : 0;
+     }
+
+     obj->feat_cache = calloc(1, sizeof(*obj->feat_cache));
+     if (!obj->feat_cache) {
+             close(token_fd);
+             return -ENOMEM;
+     }
+
+     obj->token_fd = token_fd;
+     obj->feat_cache->token_fd = token_fd;
+
+     return 0;
+}
+
 static int
 bpf_object__probe_loading(struct bpf_object *obj)
 {

@@ -4601,6 +4664,7 @@ bpf_object__probe_loading(struct bpf_object *obj)
              BPF_EXIT_INSN(),
      };
      int ret, insn_cnt = ARRAY_SIZE(insns);
+     LIBBPF_OPTS(bpf_prog_load_opts, opts, .token_fd = obj->token_fd);

      if (obj->gen_loader)
              return 0;
@@ -4610,9 +4674,9 @@ bpf_object__probe_loading(struct bpf_object *obj)
              pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);

      /* make sure basic loading works */
-     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
+     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &opts);
      if (ret < 0)
-             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
+             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
      if (ret < 0) {
              ret = errno;
              cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
@@ -4635,6 +4699,9 @@ bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
               */
              return true;

+     if (obj->token_fd)
+             return feat_supported(obj->feat_cache, feat_id);

OK, that answers the non-NULL feat_supported() case from the earlier
patch. I was just reading in order.

yep, no worries, that's what I assumed :)
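
For readers following along, the cache selection in the hunk above can be
modeled roughly as below. This is a sketch under assumptions: the types
feat_cache_model and obj_model and the function pick_feat_cache are
hypothetical stand-ins, and it assumes that passing NULL to
feat_supported() means "use a process-wide cache", which is how the hunk
reads but is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libbpf's internal feature cache. */
struct feat_cache_model {
	int token_fd; /* token used when probing; 0 means "no token" */
};

/* Process-wide cache shared by all objects that load without a token. */
static struct feat_cache_model global_cache_model;

struct obj_model {
	int token_fd;                        /* 0 when the object has no token */
	struct feat_cache_model *feat_cache; /* allocated only with a token */
};

/* Objects with a token get their own cache; everyone else shares one. */
const struct feat_cache_model *pick_feat_cache(const struct obj_model *obj)
{
	if (obj->token_fd)
		return obj->feat_cache;
	return &global_cache_model;
}

/* Usage: a token-bearing object must not see results probed without it. */
void pick_feat_cache_demo(void)
{
	struct feat_cache_model own = { .token_fd = 7 };
	struct obj_model with_token = { .token_fd = 7, .feat_cache = &own };
	struct obj_model without_token = { .token_fd = 0, .feat_cache = NULL };

	assert(pick_feat_cache(&with_token) == &own);
	assert(pick_feat_cache(&without_token) == &global_cache_model);
}
```

Keeping the caches separate matters because a probe that fails without a
token may succeed with one, so results cannot be shared between the two.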

quoted

+
      return feat_supported(NULL, feat_id);
 }

...

quoted

      btf_fd = bpf_object__btf_fd(obj);

@@ -7050,10 +7119,10 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object
 static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
                                        const struct bpf_object_open_opts *opts)
 {
-     const char *obj_name, *kconfig, *btf_tmp_path;
+     const char *obj_name, *kconfig, *btf_tmp_path, *token_path;
      struct bpf_object *obj;
      char tmp_name[64];
-     int err;
+     int err, token_fd;
      char *log_buf;
      size_t log_size;
      __u32 log_level;
@@ -7087,6 +7156,20 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
      if (log_size && !log_buf)
              return ERR_PTR(-EINVAL);

+     token_path = OPTS_GET(opts, bpf_token_path, NULL);
+     token_fd = OPTS_GET(opts, bpf_token_fd, -1);
+     /* non-empty token path can't be combined with invalid token FD */
+     if (token_path && token_path[0] != '\0' && token_fd < 0)
+             return ERR_PTR(-EINVAL);
+     if (token_path && token_path[0] == '\0') {
+             /* empty token path can't be combined with valid token FD */
+             if (token_fd > 0)
+                     return ERR_PTR(-EINVAL);
+             /* empty token_path is equivalent to invalid token_fd */
+             token_path = NULL;
+             token_fd = -1;
+     }
+
      obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
      if (IS_ERR(obj))
              return obj;
@@ -7095,6 +7178,23 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
      obj->log_size = log_size;
      obj->log_level = log_level;

+     obj->token_fd = token_fd <= 0 ? token_fd : dup_good_fd(token_fd);
+     if (token_fd > 0 && obj->token_fd < 0) {
+             err = -errno;
+             goto out;
+     }
+     if (token_path) {
+             if (strlen(token_path) >= PATH_MAX) {

small nit, might be cleaner to just have this up where the other sanity
checks are done? e.g.

   `token_path[0] != '\0' && strlen(token_path) < PATH_MAX`

just to abort earlier. But not sure I care much.

yep, makes sense, I'll move ENAMETOOLONG up
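
A rough sketch of what the consolidated up-front checks could look like
once the length check sits next to the other sanity checks. This is not
the revised patch: struct token_opts and token_opts_normalize are
hypothetical names, and the code only mirrors the rules visible in the
quoted hunk (non-empty path with negative FD is invalid, empty path with
positive FD is invalid, empty path means "disable").

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for platforms that do not define it */
#endif

/* Hypothetical container for the two options; not a libbpf type. */
struct token_opts {
	const char *path; /* NULL: not given, "": disable, otherwise BPF FS path */
	int fd;           /* < 0: disable, 0: unset, > 0: token FD */
};

/* Validates and normalizes in place; returns 0 or a negative errno. */
int token_opts_normalize(struct token_opts *o)
{
	if (o->path && o->path[0] != '\0') {
		/* non-empty path can't be combined with invalid token FD */
		if (o->fd < 0)
			return -EINVAL;
		/* length check done up front, next to the other sanity checks */
		if (strlen(o->path) >= PATH_MAX)
			return -ENAMETOOLONG;
	}
	if (o->path && o->path[0] == '\0') {
		/* empty path can't be combined with valid token FD */
		if (o->fd > 0)
			return -EINVAL;
		/* empty path is equivalent to an invalid token FD */
		o->path = NULL;
		o->fd = -1;
	}
	return 0;
}

/* Usage: an empty path disables the token, so fd is normalized to -1. */
void token_opts_normalize_demo(void)
{
	struct token_opts o = { .path = "", .fd = 0 };

	assert(token_opts_normalize(&o) == 0);
	assert(o.path == NULL && o.fd == -1);
}
```

Doing all validation before any allocation means the error paths need no
cleanup, which is the main benefit of aborting earlier.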

quoted

+                     err = -ENAMETOOLONG;
+                     goto out;
+             }
+             obj->token_path = strdup(token_path);
+             if (!obj->token_path) {
+                     err = -ENOMEM;
+                     goto out;
+             }
+     }
+
      btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL);
      if (btf_tmp_path) {
              if (strlen(btf_tmp_path) >= PATH_MAX) {

[...]
