Re: [RFC PATCH bpf-next 10/15] bpf: Use bpf_map_pages_alloc in ringbuf
From: Andrii Nakryiko <hidden>
Date: 2022-08-02 18:00:37
Also in:
bpf, linux-mm
On Tue, Aug 2, 2022 at 6:31 AM Yafang Shao [off-list ref] wrote:
On Tue, Aug 2, 2022 at 7:17 AM Andrii Nakryiko [off-list ref] wrote:
On Fri, Jul 29, 2022 at 8:23 AM Yafang Shao [off-list ref] wrote:
Introduce new helper bpf_map_pages_alloc() for this memory allocation.

Signed-off-by: Yafang Shao <redacted>
---
 include/linux/bpf.h  |  4 ++++
 kernel/bpf/ringbuf.c | 27 +++++++++------------------
 kernel/bpf/syscall.c | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 18 deletions(-)

[...]
 	/* Each data page is mapped twice to allow "virtual"
 	 * continuous read of samples wrapping around the end of ring
@@ -95,16 +95,10 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(struct bpf_map *map,
 	if (!pages)
 		return NULL;

-	for (i = 0; i < nr_pages; i++) {
-		page = alloc_pages_node(numa_node, flags, 0);
-		if (!page) {
-			nr_pages = i;
-			goto err_free_pages;
-		}
-		pages[i] = page;
-		if (i >= nr_meta_pages)
-			pages[nr_data_pages + i] = page;
-	}
+	ptr = bpf_map_pages_alloc(map, pages, nr_meta_pages, nr_data_pages,
+				  numa_node, flags, 0);
+	if (!ptr)

bpf_map_pages_alloc() has some weird and confusing interface. It fills out pages (second argument) and also returns pages as void *. Why not just return int error (0 or -ENOMEM)? You are discarding this ptr anyways.

I will change it.
But also thinking some more, bpf_map_pages_alloc() is very ringbuf specific (which other map will have exactly the same meaning for nr_meta_pages and nr_data_pages, where we also allocate 2 * nr_data_pages, etc). I don't think it makes sense to expose it as a generic internal API. Why not keep all that inside kernel/bpf/ringbuf.c instead?

Right, it is used in ringbuf.c only currently. I will keep it inside ringbuf.c.
In that case you might as well move the pages = bpf_map_area_alloc();
part into this function and return struct page ** as the result, so
that everything related to pages is handled as a single unit. Then
bpf_map_pages_free() will free not just each individual page, but also
the struct page *[] array itself.
Also please call it something ringbuf specific, e.g.,
bpf_ringbuf_pages_{alloc,free}()?
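
To make the suggested shape concrete, here is a rough userspace sketch, not the actual kernel patch. It assumes calloc()/malloc()/free() as stand-ins for bpf_map_area_alloc(), alloc_pages_node() and their free counterparts, uses a dummy struct page, and drops the map, numa_node and flags arguments. The function names follow the naming proposed above and the signatures are hypothetical.

```c
#include <stdlib.h>

/* Stand-in for the kernel's struct page; purely illustrative. */
struct page {
	char data[4096];
};

/* Hypothetical helper: frees the first nr_pages pages and the array itself. */
static void bpf_ringbuf_pages_free(struct page **pages, int nr_pages)
{
	int i;

	if (!pages)
		return;
	/* Entries past nr_pages only alias data pages, so skip them. */
	for (i = 0; i < nr_pages; i++)
		free(pages[i]);
	free(pages);
}

/* Hypothetical helper: allocates the array and the pages as one unit. */
static struct page **bpf_ringbuf_pages_alloc(int nr_meta_pages, int nr_data_pages)
{
	int nr_pages = nr_meta_pages + nr_data_pages;
	struct page **pages;
	int i;

	/* Room for the meta pages plus every data page listed twice. */
	pages = calloc(nr_meta_pages + 2 * nr_data_pages, sizeof(*pages));
	if (!pages)
		return NULL;

	for (i = 0; i < nr_pages; i++) {
		struct page *page = malloc(sizeof(*page));

		if (!page) {
			/* Unwind the i pages allocated so far, then the array. */
			bpf_ringbuf_pages_free(pages, i);
			return NULL;
		}
		pages[i] = page;
		/* Each data page appears a second time after the first copy. */
		if (i >= nr_meta_pages)
			pages[nr_data_pages + i] = page;
	}
	return pages;
}
```

With this shape the caller only checks the returned pointer for NULL and never has to unwind a partially filled array on its own.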
--
Regards
Yafang