Thread (54 messages) 54 messages, 5 authors, 2022-05-31

Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP

From: Andrii Nakryiko <hidden>
Date: 2022-05-25 04:39:42
Also in: bpf

On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev [off-list ref] wrote:
On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
[off-list ref] wrote:
quoted
On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau [off-list ref] wrote:
quoted
On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
quoted
On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau [off-list ref] wrote:
quoted
On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
quoted
We have two options:
1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point

I was doing (2) in the original patch, but switching to (1) here:

* bpf_prog_query returns all attached BPF_LSM_CGROUP programs
regardless of attach_btf_id
* attach_btf_id is exported via bpf_prog_info

Signed-off-by: Stanislav Fomichev <redacted>
---
 include/uapi/linux/bpf.h |   5 ++
 kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
 kernel/bpf/syscall.c     |   4 +-
 3 files changed, 81 insertions(+), 31 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b9d2d6de63a7..432fc5f49567 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1432,6 +1432,7 @@ union bpf_attr {
              __u32           attach_flags;
              __aligned_u64   prog_ids;
              __u32           prog_cnt;
+             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
      } query;

      struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -5911,6 +5912,10 @@ struct bpf_prog_info {
      __u64 run_cnt;
      __u64 recursion_misses;
      __u32 verified_insns;
+     /* BTF ID of the function to attach to within BTF object identified
+      * by btf_id.
+      */
+     __u32 attach_btf_func_id;
 } __attribute__((aligned(8)));

 struct bpf_map_info {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a959cdd22870..08a1015ee09e 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
                            union bpf_attr __user *uattr)
 {
+     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
      __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
      enum bpf_attach_type type = attr->query.attach_type;
      enum cgroup_bpf_attach_type atype;
@@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
      struct hlist_head *progs;
      struct bpf_prog *prog;
      int cnt, ret = 0, i;
+     int total_cnt = 0;
      u32 flags;

-     atype = to_cgroup_bpf_attach_type(type);
-     if (atype < 0)
-             return -EINVAL;
+     enum cgroup_bpf_attach_type from_atype, to_atype;

-     progs = &cgrp->bpf.progs[atype];
-     flags = cgrp->bpf.flags[atype];
+     if (type == BPF_LSM_CGROUP) {
+             from_atype = CGROUP_LSM_START;
+             to_atype = CGROUP_LSM_END;
+     } else {
+             from_atype = to_cgroup_bpf_attach_type(type);
+             if (from_atype < 0)
+                     return -EINVAL;
+             to_atype = from_atype;
+     }

-     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
-                                           lockdep_is_held(&cgroup_mutex));
+     for (atype = from_atype; atype <= to_atype; atype++) {
+             progs = &cgrp->bpf.progs[atype];
+             flags = cgrp->bpf.flags[atype];

-     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
-             cnt = bpf_prog_array_length(effective);
-     else
-             cnt = prog_list_length(progs);
+             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+                                                   lockdep_is_held(&cgroup_mutex));

-     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
-             return -EFAULT;
-     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
+             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+                     total_cnt += bpf_prog_array_length(effective);
+             else
+                     total_cnt += prog_list_length(progs);
+     }
+
+     if (type != BPF_LSM_CGROUP)
+             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
+                     return -EFAULT;
+     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
              return -EFAULT;
-     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
+     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
              /* return early if user requested only program count + flags */
              return 0;
-     if (attr->query.prog_cnt < cnt) {
-             cnt = attr->query.prog_cnt;
+
+     if (attr->query.prog_cnt < total_cnt) {
+             total_cnt = attr->query.prog_cnt;
              ret = -ENOSPC;
      }

-     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
-             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
-     } else {
-             struct bpf_prog_list *pl;
-             u32 id;
+     for (atype = from_atype; atype <= to_atype; atype++) {
+             if (total_cnt <= 0)
+                     break;

-             i = 0;
-             hlist_for_each_entry(pl, progs, node) {
-                     prog = prog_list_prog(pl);
-                     id = prog->aux->id;
-                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-                             return -EFAULT;
-                     if (++i == cnt)
-                             break;
+             progs = &cgrp->bpf.progs[atype];
+             flags = cgrp->bpf.flags[atype];
+
+             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+                                                   lockdep_is_held(&cgroup_mutex));
+
+             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+                     cnt = bpf_prog_array_length(effective);
+             else
+                     cnt = prog_list_length(progs);
+
+             if (cnt >= total_cnt)
+                     cnt = total_cnt;
+
+             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
+                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
+             } else {
+                     struct bpf_prog_list *pl;
+                     u32 id;
+
+                     i = 0;
+                     hlist_for_each_entry(pl, progs, node) {
+                             prog = prog_list_prog(pl);
+                             id = prog->aux->id;
+                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
+                                     return -EFAULT;
+                             if (++i == cnt)
+                                     break;
+                     }
              }
+
+             if (prog_attach_flags)
+                     for (i = 0; i < cnt; i++)
+                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
+                                     return -EFAULT;
+
+             prog_ids += cnt;
+             total_cnt -= cnt;
+             if (prog_attach_flags)
+                     prog_attach_flags += cnt;
      }
      return ret;
 }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5ed2093e51cc..4137583c04a2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
      }
 }

-#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
+#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags

 static int bpf_prog_query(const union bpf_attr *attr,
                        union bpf_attr __user *uattr)
@@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
      case BPF_CGROUP_SYSCTL:
      case BPF_CGROUP_GETSOCKOPT:
      case BPF_CGROUP_SETSOCKOPT:
+     case BPF_LSM_CGROUP:
              return cgroup_bpf_prog_query(attr, uattr);
      case BPF_LIRC_MODE2:
              return lirc_prog_query(attr, uattr);
@@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,

      if (prog->aux->btf)
              info.btf_id = btf_obj_id(prog->aux->btf);
+     info.attach_btf_func_id = prog->aux->attach_btf_id;
Note that exposing prog->aux->attach_btf_id only may not be enough
unless it can assume info.attach_btf_id is always referring to btf_vmlinux
for all bpf prog types.
We also export btf_id two lines above, right? Btw, I left a comment in
the bpftool about those btf_ids, I'm not sure how resolve them and
always assume vmlinux for now.
yeah, that btf_id above is the cgroup-lsm prog's btf_id which has its
func info, line info...etc.   It is not the one the attach_btf_id correspond
to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
target btf id here).

It needs a consensus on where this attach_btf_id, target btf id, and
prog_attach_flags should be.  If I read the patch 7 thread correctly,
I think Andrii is suggesting to expose them to userspace through link, so
potentially putting them in bpf_link_info.  The bpf_prog_query will
output a list of link ids.  The same probably applies to
Yep and I think it makes sense because link is representing one
specific attachment (and I presume flags can be stored inside the link
itself as well, right?).

But if legacy non-link BPF_PROG_ATTACH is supported then using
bpf_link_info won't cover legacy prog-only attachments.
I don't have any attachment to the legacy apis, I'm supporting them
only because it takes two lines of code; we can go link-only if there
is an agreement that it's inherently better.

How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in the
userspace (for BPF_LSM_CGROUP only) over all links
(BPF_LINK_GET_NEXT_ID) and will find the the ones with matching prog
ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?

That way we keep new fields in bpf_link_info, but we don't have to
extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
way to do it. Exporting links via new link_fds would mean we'd have to
support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
seems to be messy. If, in the future, we figure out a better way to
expose a list of attached/effective links per cgroup, we can
convert/optimize bpftool.
Why not use iter/bpf_link program (see progs/bpf_iter_bpf_link.c for
an example) instead? Once you have struct bpf_link and you know it's
cgroup link, you can cast it to struct bpf_cgroup_link and get access
to prog and cgroup. From cgroup to cgroup_bpf you can even get access
to effective array. Basically whatever kernel has access to you can
have access to from bpftool without extending any UAPIs.
WDYT?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help