Re: [PATCH bpf] bpf: 'fix' for undefined future potential exploits of BPF_PROG_LOAD

From: Maciej Żenczykowski <hidden>
Date: 2026-01-03 16:10:44
Also in: bpf, lkml

On Sat, Jan 3, 2026 at 1:14 AM Alexei Starovoitov
[off-list ref] wrote:

> > I am actually aware of it, but we cannot use sysctl_unprivileged_bpf_disabled,
> > because (last I checked) it disables map creation as well,
>
> yes, because we had bugs in maps too. prog_load has a bigger
> bug surface, but map_create can have issues too.
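
To make the coupling concrete, here is a simplified user-space model of
that one gate.  It assumes the sysctl check refuses BPF_MAP_CREATE and
BPF_PROG_LOAD together whenever kernel.unprivileged_bpf_disabled is
non-zero and the caller lacks CAP_BPF/CAP_SYS_ADMIN.  The enum and
function names are invented for illustration and are not kernel
identifiers; later capability checks are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bpf() commands of interest. */
enum model_cmd { MODEL_MAP_CREATE, MODEL_PROG_LOAD, MODEL_MAP_LOOKUP };

/*
 * Simplified model of the sysctl gate only.
 * sysctl_disabled: value of kernel.unprivileged_bpf_disabled (0, 1 or 2)
 * capable:         caller holds CAP_BPF or CAP_SYS_ADMIN
 * Returns true if the command gets past this gate.
 */
static bool model_sysctl_gate(int sysctl_disabled, bool capable,
			      enum model_cmd cmd)
{
	assert(sysctl_disabled >= 0 && sysctl_disabled <= 2);

	if (!sysctl_disabled || capable)
		return true;

	switch (cmd) {
	case MODEL_MAP_CREATE:
	case MODEL_PROG_LOAD:
		return false;	/* both are refused together */
	default:
		return true;	/* fd-based commands are not gated here */
	}
}
```

In this model no sysctl value lets a less privileged caller through for
MODEL_MAP_CREATE while refusing MODEL_PROG_LOAD, which is the limitation
being discussed.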

Yes, of course, bugs happen in all sorts of spots in the kernel,
they're unavoidable in general, all we can do is try to limit our
exposure to as many of them as possible - by putting in various
barriers.  That logic is why we have things like layered sandboxes.

I think you'll agree with me that it is a lot easier to
catch/fix/understand issues in the bpf map related code than it is to
understand issues with verifier/jit.  It's also significantly easier to
test/fuzz map related stuff.

Anyway, in a sense it doesn't matter.  BPF map memory consumption is a
significant problem.  As such, while we can require program loading at
boot, being able to dynamically create (inner) maps after the fact
is a way to limit permanent memory use for potentially unused (or
lightly used) programs.

(Side note: it would be nice if we could somehow swap a map into an
existing program at run time without it being in a 1-element outer
array... perhaps we'd need to flag such maps as run time replaceable
[provided types match], or something)
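
For reference, a minimal user-space sketch of the existing workaround
mentioned in the side note (not the proposed feature): a 1-element
BPF_MAP_TYPE_ARRAY_OF_MAPS whose single slot is repointed at a freshly
created inner map.  It uses the raw bpf() syscall and the uapi
`union bpf_attr`; the helper names are made up for this example, and
creating the inner maps themselves is left out.

```c
#include <assert.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper; glibc does not provide a bpf() function. */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
	return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/* Describe a 1-slot outer map; template_fd fixes the inner map shape. */
static void fill_outer_attr(union bpf_attr *attr, int template_fd)
{
	assert(template_fd >= 0);
	memset(attr, 0, sizeof(*attr));
	attr->map_type = BPF_MAP_TYPE_ARRAY_OF_MAPS;
	attr->key_size = sizeof(uint32_t);
	attr->value_size = sizeof(uint32_t);	/* value is an inner map fd */
	attr->max_entries = 1;
	attr->inner_map_fd = template_fd;
}

/* Describe "store *new_inner_fd at *key in outer_fd". */
static void fill_swap_attr(union bpf_attr *attr, int outer_fd,
			   const uint32_t *key, const int *new_inner_fd)
{
	assert(key && new_inner_fd);
	memset(attr, 0, sizeof(*attr));
	attr->map_fd = outer_fd;
	attr->key = (uint64_t)(uintptr_t)key;
	attr->value = (uint64_t)(uintptr_t)new_inner_fd;
	attr->flags = BPF_ANY;
}

/* Repoint slot 0 at a new inner map; returns 0, or -1 with errno set. */
static int swap_inner_map(int outer_fd, int new_inner_fd)
{
	union bpf_attr attr;
	uint32_t key = 0;

	fill_swap_attr(&attr, outer_fd, &key, &new_inner_fd);
	return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr) ? -1 : 0;
}
```

The outer map would be created with sys_bpf(BPF_MAP_CREATE, &attr) after
fill_outer_attr().  On the program side the cost is an extra
bpf_map_lookup_elem() on the outer map with key 0 before every access to
the inner map, which is the indirection the side note would like to
avoid.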

> > I don't believe so.  How are you suggesting we globally block BPF_PROG_LOAD,
> > while there will still be some CAP_SYS_ADMIN processes out of necessity,
> > and without blocking map creation?
>
> Sounds like you don't trust root, yet believe that map_create is safe
> for unpriv?!

FYI, we don't blindly trust kernel ring zero either (AFAIK on some
devices the hypervisor will actually audit all new ring 0 executable
pages, which is difficult with bpf)...

The 'unpriv' we're talking about here is not truly unpriv - it's just
less privileged.  It's still dedicated signed system code running in a
dedicated selinux domain, with sepolicy restricting map_create to
those domains.  It's just that the restrictions on bpf access are
wider than on bpf map creation, which in turn are wider than on bpf
program loading.  There are various levels of restriction.  Some of it
is uid/gid based, some sepolicy, etc.

> I cannot recommend such a security posture to anyone.

Yes, obviously, allowing random apps any access to eBPF is a recipe
for disaster.
Bad enough they have access to cBPF.

> Use LSM to block prog_load or use bpf token with userns for fine grained access.

I hope you're aware that (as of when I last checked, about half a year
ago) BPF LSM doesn't work due to being buggy: there's a hidden
requirement to enable DYNAMIC FTRACE, without which it is non-functional
(at least on x86-64, likely all archs).  Trying to attach a BPF LSM hook
unconditionally fails with EBUSY on such a kernel configuration.

I reported that here on the mailing list, search for "6.12.30 x86_64
BPF_LSM doesn't work without (?) fentry/mcount config options" (Aug
22, 2025) - you were cc'ed on the thread.
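
A small pre-flight check along those lines, as a sketch only.  It assumes
the option meant above is CONFIG_DYNAMIC_FTRACE and that the kernel
config is available as plain text (typically from /boot/config-<release>
or a decompressed /proc/config.gz); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if config (text in .config format) has the exact line "<opt>=y". */
static bool config_has_y(const char *config, const char *opt)
{
	size_t optlen;
	const char *line = config;

	assert(config && opt);
	optlen = strlen(opt);

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len == optlen + 2 &&
		    strncmp(line, opt, optlen) == 0 &&
		    strncmp(line + optlen, "=y", 2) == 0)
			return true;
		line = eol ? eol + 1 : NULL;
	}
	return false;
}

/* Assumption from the report above: attach needs both options enabled. */
static bool bpf_lsm_attach_plausible(const char *config)
{
	return config_has_y(config, "CONFIG_BPF_LSM") &&
	       config_has_y(config, "CONFIG_DYNAMIC_FTRACE");
}
```

A false result only says the configuration matches the failing case
described in the report; a true result does not prove that attaching
will succeed.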
