Re: [PATCH bpf] bpf: 'fix' for undefined future potential exploits of BPF_PROG_LOAD
From: Maciej Żenczykowski <hidden>
Date: 2026-01-03 16:10:44
Also in:
bpf, lkml
On Sat, Jan 3, 2026 at 1:14 AM Alexei Starovoitov [off-list ref] wrote:
> > I am actually aware of it, but we cannot use sysctl_unprivileged_bpf_disabled, because (last I checked) it disables map creation as well,
>
> yes, because we had bugs in maps too. prog_load has a bigger bug surface, but map_create can have issues too.
Yes, of course. Bugs happen in all sorts of spots in the kernel and are unavoidable in general; all we can do is limit our exposure to as many of them as possible by putting up various barriers. That logic is why we have things like layered sandboxes.

I think you'll agree with me that it is a lot easier to catch/fix/understand bugs in the bpf map related code than it is to understand issues with the verifier/jit. It's also significantly easier to test/fuzz map related stuff.

Anyway, in a sense it doesn't matter. BPF map memory consumption is a significant problem. So while we can require program loading at boot, being able to dynamically create (inner) maps after the fact is a way to limit permanent memory use for potentially unused (or lightly used) programs.

(Side note: it would be nice if we could somehow swap a map into an existing program at run time without it being in a 1-element outer array... perhaps we'd need to flag such maps as run time replaceable [provided types match], or something)
> > I don't believe so. How are you suggesting we globally block BPF_PROG_LOAD, while there will still be some CAP_SYS_ADMIN processes out of necessity, and without blocking map creation?
>
> Sounds like you don't trust root, yet believe that map_create is safe for unpriv?!
FYI, we don't blindly trust kernel ring zero either (AFAIK on some devices the hypervisor will actually audit all new ring 0 executable pages, which is difficult with bpf)... The 'unpriv' we're talking about here is not truly unpriv - it's just less privileged. It's still dedicated signed system code running in a dedicated selinux domain, with sepolicy restricting map_create to those domains. It's just that the restrictions on bpf access are wider than those on bpf map creation, which in turn are wider than those on bpf program loading. There are various levels of restrictions. Some of it is uid/gid based, some sepolicy, etc.
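To make the nesting concrete, here is a minimal sketch of that kind of tiering as a toy model in plain C. All names here (the `bpf_op` and `bpf_tier` enums, `tier_allows`) are hypothetical and made up for illustration; this is not our actual sepolicy or any kernel interface, it only shows the "each tier is a subset of the previous one" shape.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operations, ordered from least to most sensitive. */
enum bpf_op {
    OP_USE_EXISTING = 0, /* e.g. access an already created map */
    OP_MAP_CREATE   = 1, /* create new (inner) maps */
    OP_PROG_LOAD    = 2, /* load programs through verifier/jit */
};

/* Hypothetical privilege tiers; each tier includes everything below it. */
enum bpf_tier {
    TIER_NONE        = 0, /* random apps: no eBPF access at all */
    TIER_USER        = 1, /* may only use existing objects */
    TIER_MAP_CREATOR = 2, /* may also create maps */
    TIER_LOADER      = 3, /* may also load programs (e.g. at boot) */
};

/* Returns true if a process in `tier` may perform `op`. */
bool tier_allows(enum bpf_tier tier, enum bpf_op op)
{
    assert(op >= OP_USE_EXISTING && op <= OP_PROG_LOAD);
    /* Operation N requires at least tier N + 1. */
    return (int)tier >= (int)op + 1;
}
```

In this model `tier_allows(TIER_MAP_CREATOR, OP_MAP_CREATE)` is true while `tier_allows(TIER_MAP_CREATOR, OP_PROG_LOAD)` is false, which is the combination a single global on/off knob cannot express.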
> I cannot recommend such a security posture to anyone.
Yes, obviously, allowing random apps any access to eBPF is a recipe for disaster. Bad enough they have access to cBPF.
> Use LSM to block prog_load or use bpf token with userns for fine grained access.
I hope you're aware that (last I checked, which was half a year ago or so) BPF LSM doesn't work due to a bug: there's a hidden requirement to enable DYNAMIC_FTRACE, without which it is non-functional (at least on x86-64, likely on all archs). Trying to attach a BPF LSM hook unconditionally fails with EBUSY on such a kernel configuration. I reported that here on the mailing list; search for "6.12.30 x86_64 BPF_LSM doesn't work without (?) fentry/mcount config options" (Aug 22, 2025). You were cc'ed on the thread.
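For anyone wanting to check whether their build is in that state, here is a rough sketch that classifies a kernel config. It assumes you already have the config as text (e.g. from a build tree's .config) and only looks at the two Kconfig symbols CONFIG_BPF_LSM and CONFIG_DYNAMIC_FTRACE. The helper names (`config_enabled`, `classify_bpf_lsm`) and the enum are made up for this example, and the check reflects only the behaviour I reported, not a confirmed root cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `cfg` (kernel .config text) has a line that is exactly "<sym>=y". */
bool config_enabled(const char *cfg, const char *sym)
{
    assert(cfg != NULL && sym != NULL);
    size_t n = strlen(sym);

    for (const char *p = cfg; p != NULL && *p != '\0'; ) {
        /* Match only at the start of a line, followed by "=y" and end of line. */
        if (strncmp(p, sym, n) == 0 && strncmp(p + n, "=y", 2) == 0) {
            char end = p[n + 2];
            if (end == '\n' || end == '\r' || end == '\0')
                return true;
        }
        /* Advance to the start of the next line, if any. */
        p = strchr(p, '\n');
        if (p != NULL)
            p++;
    }
    return false;
}

/* Hypothetical classification of a config with respect to the report above. */
enum bpf_lsm_status {
    BPF_LSM_NOT_BUILT,              /* CONFIG_BPF_LSM is not =y */
    BPF_LSM_WITHOUT_DYNAMIC_FTRACE, /* the combination that failed with EBUSY for me */
    BPF_LSM_WITH_DYNAMIC_FTRACE,
};

enum bpf_lsm_status classify_bpf_lsm(const char *cfg)
{
    if (!config_enabled(cfg, "CONFIG_BPF_LSM"))
        return BPF_LSM_NOT_BUILT;
    if (!config_enabled(cfg, "CONFIG_DYNAMIC_FTRACE"))
        return BPF_LSM_WITHOUT_DYNAMIC_FTRACE;
    return BPF_LSM_WITH_DYNAMIC_FTRACE;
}
```

Matching is anchored to the start of a line so that symbols such as CONFIG_DYNAMIC_FTRACE_WITH_REGS or commented-out "is not set" lines are not mistaken for the option itself.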