Re: [RFC bpf-next] libbpf: increase rlimit before trying to create BPF maps

From: Daniel Borkmann <daniel@iogearbox.net>
Date: 2018-11-02 18:14:53

On 11/01/2018 06:18 PM, Quentin Monnet wrote:

2018-10-30 15:23 UTC+0000 ~ Quentin Monnet [off-list ref]

quoted

The limit for memory locked in the kernel by a process is usually set to
64 bytes by default. This can be an issue when creating large BPF maps.
A workaround is to raise this limit for the current process before
trying to create a new BPF map. Changing the hard limit requires the
CAP_SYS_RESOURCE and can usually only be done by root user (but then
only root can create BPF maps).

Sorry, the parenthesis is not correct: non-root users can in fact create
BPF maps as well. If a non-root user calls the function to create a map,
setrlimit() will fail silently (but set errno), and the program will
simply go on with its rlimit unchanged.

quoted

As far as I know there is not API to get the current amount of memory
locked for a user, therefore we cannot raise the limit only when
required. One solution, used by bcc, is to try to create the map, and on
getting a EPERM error, raising the limit to infinity before giving
another try. Another approach, used in iproute, is to raise the limit in
all cases, before trying to create the map.

Here we do the same as in iproute2: the rlimit is raised to infinity
before trying to load the map.

I send this patch as a RFC to see if people would prefer the bcc
approach instead, or the rlimit change to be in bpftool rather than in
libbpf.

I'd avoid doing something like this in a generic library; it's basically an
ugly hack for the kind of accounting we're doing and only shows that while
this was "good enough" to start off with in the early days, we should be
doing something better today if every application raises it to inf anyway
then it's broken. :) It just shows that this missed its purpose. Similarly
to the jit_limit discussion on rlimit, perhaps we should be considering
switching to something else entirely from kernel side. Could be something
like memcg but this definitely needs some more evaluation first. (Meanwhile
I'd not change the lib but callers instead and once we have something better
in place we remove this type of "raising to inf" from the tree ...)

quoted

Signed-off-by: Quentin Monnet <redacted>
---
 tools/lib/bpf/bpf.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 03f9bcc4ef50..456a5a7b112c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c

@@ -26,6 +26,8 @@
 #include <unistd.h>
 #include <asm/unistd.h>
 #include <linux/bpf.h>
+#include <sys/resource.h>
+#include <sys/types.h>
 #include "bpf.h"
 #include "libbpf.h"
 #include <errno.h>

@@ -68,8 +70,11 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
 int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
 {
 	__u32 name_len = create_attr->name ? strlen(create_attr->name) : 0;
+	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
 	union bpf_attr attr;
 
+	setrlimit(RLIMIT_MEMLOCK, &rinf);
+
 	memset(&attr, '\0', sizeof(attr));
 
 	attr.map_type = create_attr->map_type;

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help