Thread (17 messages) 17 messages, 5 authors, 2020-06-03

Re: [PATCH bpf-next RFC 2/3] bpf: devmap dynamic map-value storage area based on BTF

From: Alexei Starovoitov <hidden>
Date: 2020-06-02 18:27:17
Also in: bpf

On Tue, Jun 2, 2020 at 12:00 AM Jesper Dangaard Brouer
[off-list ref] wrote:
On Mon, 1 Jun 2020 14:30:12 -0700
Alexei Starovoitov [off-list ref] wrote:
quoted
On Fri, May 29, 2020 at 05:59:45PM +0200, Jesper Dangaard Brouer wrote:
quoted
+
+/* Expected BTF layout that match struct bpf_devmap_val */
+static const struct expect layout[] = {
+   {BTF_KIND_INT,          true,    0,      4,     "ifindex"},
+   {BTF_KIND_UNION,        false,  32,      4,     "bpf_prog"},
+   {BTF_KIND_STRUCT,       false,  -1,     -1,     "storage"}
+};
+
+static int dev_map_check_btf(const struct bpf_map *map,
+                        const struct btf *btf,
+                        const struct btf_type *key_type,
+                        const struct btf_type *value_type)
+{
+   struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
+   u32 found_members_cnt = 0;
+   u32 int_data;
+   int off;
+   u32 i;
+
+   /* Validate KEY type and size */
+   if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT)
+           return -EOPNOTSUPP;
+
+   int_data = *(u32 *)(key_type + 1);
+   if (BTF_INT_BITS(int_data) != 32 || BTF_INT_OFFSET(int_data) != 0)
+           return -EOPNOTSUPP;
+
+   /* Validate VALUE have layout that match/map-to struct bpf_devmap_val
+    * - With a flexible size of member 'storage'.
+    */
+
+   if (BTF_INFO_KIND(value_type->info) != BTF_KIND_STRUCT)
+           return -EOPNOTSUPP;
+
+   /* Struct/union members in BTF must not exceed (max) expected members */
+   if (btf_type_vlen(value_type) > ARRAY_SIZE(layout))
+                   return -E2BIG;
+
+   for (i = 0; i < ARRAY_SIZE(layout); i++) {
+           off = btf_find_expect_layout_offset(btf, value_type, &layout[i]);
+
+           if (off < 0 && layout[i].mandatory)
+                   return -EUCLEAN;
+
+           if (off >= 0)
+                   found_members_cnt++;
+
+           /* Transfer layout config to map */
+           switch (i) {
+           case 0:
+                   dtab->cfg.btf_offset.ifindex = off;
+                   break;
+           case 1:
+                   dtab->cfg.btf_offset.bpf_prog = off;
+                   break;
+           default:
+                   break;
+           }
+   }
+
+   /* Detect if BTF/vlen have members that were not found */
+   if (btf_type_vlen(value_type) > found_members_cnt)
+           return -E2BIG;
+
+   return 0;
+}
This layout validation looks really weird to me.
That layout[] array sort of complements BTF to describe the data,
but double describe of the layout feels like hack.
This is the kind of feedback I'm looking for.  I want to make the
map-value more dynamic.  It seems so old school to keep extending the
map-value with a size and fixed binary layout, when we have BTF
available.  I'm open to input on how to better verify/parse/desc the
expected BTF layout for kernel-code side.

The patch demonstrates that this is possible, I'm open for changes.
E.g. devmap is now extended with a bpf_prog, but most end-users will
not be using this feature. Today they can use value_size=4 to avoid
using this field. When we extend map-value again, then end-users are
force into providing 'bpf_prog.fd' if they want to use the newer
options.  In this patch end-users don't need to provide 'bpf_prog' if
they don't use it. Via BTF we can see this struct member can be skipped.
I think 'struct bpf_devmap_val' should be in uapi/bpf.h.
That's what it is and it will be extended with new fields at the end
just like all other structs in uapi/bpf.h
I don't think BTF can become a substitute for uapi
where uapi struct has to have all fields defined and backwards supported
by the kernel.
BTF is for flexible structs where fields may disappear.
BTF is there to define a meaning of a binary blob.
'struct bpf_devmap_val' is not such thing. It's very much known with
fixed fields and fixed meaning.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help