Thread (20 messages), 4 authors, 2023-12-25

Re: [PATCH bpf-next 1/3] bpf: implement relay map basis

From: Philo Lu <hidden>
Date: 2023-12-23 02:54:57
Also in: bpf


On 2023/12/22 22:45, Jiri Olsa wrote:
On Fri, Dec 22, 2023 at 08:21:44PM +0800, Philo Lu wrote:

SNIP
+/* bpf_attr is used as follows:
+ * - key size: must be 0
+ * - value size: value will be used as directory name by map_update_elem
+ *   (to create relay files). If passed as 0, it will be set to NAME_MAX as
+ *   default
+ *
+ * - max_entries: subbuf size
+ * - map_extra: subbuf num, default as 8
+ *
+ * When alloc, we do not set up relay files considering dir_name conflicts.
+ * Instead we use relay_late_setup_files() in map_update_elem(), and thus the
+ * value is used as dir_name, and map->name is used as base_filename.
+ */
+static struct bpf_map *relay_map_alloc(union bpf_attr *attr)
+{
+	struct bpf_relay_map *rmap;
+
+	if (unlikely(attr->map_flags & ~RELAY_CREATE_FLAG_MASK))
+		return ERR_PTR(-EINVAL);
+
+	/* key size must be 0 in relay map */
+	if (unlikely(attr->key_size))
+		return ERR_PTR(-EINVAL);
+
+	if (unlikely(attr->value_size > NAME_MAX)) {
+		pr_warn("value_size should be no more than %d\n", NAME_MAX);
+		return ERR_PTR(-EINVAL);
+	} else if (attr->value_size == 0)
+		attr->value_size = NAME_MAX;
the concept of no key with just value seems strange.. I never worked
with relay channels, so perhaps stupid question: but why not have one
relay channel for given key? having the debugfs like:

   /sys/kernel/debug/my_rmap/mychannel/<cpu>
Here, a relay map is actually a relay channel, which includes buffers
for all cpus. I think two levels are enough when we use a relay map in
`/sys/kernel/debug/`: <dir_name>/<map_name>[#cpu]. The `dir_name` is
necessary because users could use the same `map_name` in different bpf
programs, so it serves as an additional tag to distinguish them.
The `dir_name` is set by the user through relay_map_update_elem.

Here is an example. Assume we have 2 relay maps (rmap_a and rmap_b) and
2 cpus; the debugfs layout will then look like:
/sys/kernel/debug/<dir_name1>/rmap_a0
/sys/kernel/debug/<dir_name1>/rmap_a1
/sys/kernel/debug/<dir_name2>/rmap_b0
/sys/kernel/debug/<dir_name2>/rmap_b1
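To make the naming rule concrete, here is a minimal user-space sketch of how a reader could compute the path of each per-cpu file. `relay_file_path()` is a hypothetical helper written only for illustration; it is not part of the patch, and it assumes debugfs is mounted at /sys/kernel/debug.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space helper (not part of the patch): format the
 * debugfs path of one per-cpu relay file as <dir_name>/<map_name><cpu>.
 * Returns 0 on success, -1 on empty names or if buf is too small.
 */
static int relay_file_path(char *buf, size_t len, const char *dir_name,
			   const char *map_name, unsigned int cpu)
{
	int n;

	assert(buf && dir_name && map_name);
	if (strlen(dir_name) == 0 || strlen(map_name) == 0)
		return -1;

	/* Assumes the default debugfs mount point. */
	n = snprintf(buf, len, "/sys/kernel/debug/%s/%s%u",
		     dir_name, map_name, cpu);
	/* snprintf reports truncation by returning n >= len. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A reader would call this once per cpu and open the resulting file for each one.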
So I think the key point here is that we just need one field to set the
`dir_name`, either key or value. I chose a NULL key because I think it
suggests that "normally map_update_elem should be invoked just once for
a relay map". But I think it is also okay to use the key instead, with
the value as NULL.
+
+	/* set default subbuf num */
+	attr->map_extra = attr->map_extra & UINT_MAX;
+	if (!attr->map_extra)
+		attr->map_extra = 8;
+
+	if (!attr->map_name || strlen(attr->map_name) == 0)
attr->map_name is always != NULL
+		return ERR_PTR(-EINVAL);
+
+	rmap = bpf_map_area_alloc(sizeof(*rmap), NUMA_NO_NODE);
+	if (!rmap)
+		return ERR_PTR(-ENOMEM);
+
+	bpf_map_init_from_attr(&rmap->map, attr);
+
+	rmap->relay_cb.create_buf_file = create_buf_file_handler;
+	rmap->relay_cb.remove_buf_file = remove_buf_file_handler;
+	if (attr->map_flags & BPF_F_OVERWRITE)
+		rmap->relay_cb.subbuf_start = subbuf_start_overwrite;
+
+	rmap->relay_chan = relay_open(NULL, NULL,
+							attr->max_entries, attr->map_extra,
+							&rmap->relay_cb, NULL);
wrong indentation
Got it. I will adjust it.
+	if (!rmap->relay_chan)
+		return ERR_PTR(-EINVAL);
+
+	return &rmap->map;
+}
+
+static void relay_map_free(struct bpf_map *map)
+{
+	struct bpf_relay_map *rmap;
+
+	rmap = container_of(map, struct bpf_relay_map, map);
+	relay_close(rmap->relay_chan);
+	debugfs_remove_recursive(rmap->relay_chan->parent);
+	kfree(rmap);
should you use bpf_map_area_free instead?
Thanks for catching. Will fix it.
jirka
+}
+
+static void *relay_map_lookup_elem(struct bpf_map *map, void *key)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
+static long relay_map_update_elem(struct bpf_map *map, void *key, void *value,
+				   u64 flags)
+{
+	return -EOPNOTSUPP;
+}
+
+static long relay_map_delete_elem(struct bpf_map *map, void *key)
+{
+	return -EOPNOTSUPP;
+}
+
+static int relay_map_get_next_key(struct bpf_map *map, void *key,
+				    void *next_key)
+{
+	return -EOPNOTSUPP;
+}
+
+static u64 relay_map_mem_usage(const struct bpf_map *map)
+{
+	struct bpf_relay_map *rmap;
+	u64 usage = sizeof(struct bpf_relay_map);
+
+	rmap = container_of(map, struct bpf_relay_map, map);
+	usage += sizeof(struct rchan);
+	usage += (sizeof(struct rchan_buf) + rmap->relay_chan->alloc_size)
+			 * num_online_cpus();
+	return usage;
+}
+
+BTF_ID_LIST_SINGLE(relay_map_btf_ids, struct, bpf_relay_map)
+const struct bpf_map_ops relay_map_ops = {
+	.map_meta_equal = bpf_map_meta_equal,
+	.map_alloc = relay_map_alloc,
+	.map_free = relay_map_free,
+	.map_lookup_elem = relay_map_lookup_elem,
+	.map_update_elem = relay_map_update_elem,
+	.map_delete_elem = relay_map_delete_elem,
+	.map_get_next_key = relay_map_get_next_key,
+	.map_mem_usage = relay_map_mem_usage,
+	.map_btf_id = &relay_map_btf_ids[0],
+};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1bf9805ee185..35ae54ac6736 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1147,6 +1147,7 @@ static int map_create(union bpf_attr *attr)
  	}
  
  	if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER &&
+	    attr->map_type != BPF_MAP_TYPE_RELAY &&
  	    attr->map_extra != 0)
  		return -EINVAL;
  
-- 
2.32.0.3.g01195cf9f