Re: Closing the BPF map permission loophole

From: Roberto Sassu <hidden>
Date: 2022-09-30 09:57:28
Also in: bpf

On Thu, 2022-09-29 at 18:30 -0400, Paul Moore wrote:

On Thu, Sep 29, 2022 at 3:55 AM Roberto Sassu
[off-list ref] wrote:

quoted

On Wed, 2022-09-28 at 20:24 -0400, Paul Moore wrote:

quoted

I only became aware of this when the LSM list was CC'd so I'm a
little
behind on what is going on here ... looking quickly through the
mailing list archive it looks like there is an issue with BPF map
permissions not matching well with their associated fd
permissions,
yes?  From a LSM perspective, there are a couple of hooks that
currently use the fd's permissions (read/write) to determine the
appropriate access control check.

From what I understood, access control on maps is done in two
steps.
First, whenever someone attempts to get a fd to a map
security_bpf_map() is called. LSM implementations could check
access if
the current process has the right to access the map (whose label
can be
assigned at map creation time with security_bpf_map_alloc()).

[NOTE: SELinux is currently the only LSM which provides BPF access
controls, so they are going to be my example here and in the rest of
this email.]

In the case of SELinux, security_bpf_map() does check the access
between the current task and the BPF map itself (which inherits its
security label from its creator), with the actual permission
requested
being determined by the fmode_t parameter passed to the LSM hook.
Looking at the current BPF code, the callers seem to take that from
various different places
(bpf_attr:{file_flags,map_flags,open_flags}).
This could be due solely to the different operations being done by
the
callers, which would make me believe everything is correct, but given
this thread it seems reasonable for someone with a better
understanding of BPF than me to double check this.  Can you help
verify that everything is okay here?

I also don't have concerns on this part. eBPF maintainers can help to
confirm.

Second, whenever the holder of the obtained fd wants to do an

quoted

operation
on the map (lookup, update, delete, ...), eBPF checks if the fd
modes
are compatible with the operation to perform (e.g. lookup requires
FMODE_CAN_READ).

To be clear, from what I can see, it looks like the LSM is not
checking the fd modes, but rather the modes stored in the bpf_attr,
which I get the impression do not always match the fd modes.  Yes?
No?

Correct, there is no revalidation. Fd modes represent what LSMs granted
to the fd holder. eBPF allows operations on the object behind the fd
depending on the stored fd modes (for maps, not sure about the other
object types).

There is also LSM/SELinux code which checks the permissions when a
BPF
map is passed between tasks via a fd.  Currently the SELinux check
only looks at the file:f_mode to get the permissions to check, but if
the f_mode bits are not the authoritative record of what is allowed
in
the BPF map, perhaps we need to change that to use one of the
bpf_attr
mode bits (my gut feeling is bpf_attr:open_flags)?

quoted

One problem is that the second part is missing for some operations
dealing with the map fd:

Map iterators:
https://lore.kernel.org/bpf/20220906170301.256206-1-roberto.sassu@huaweicloud.com/ (local)

You'll need to treat me like an idiot when it comes to BPF maps ;)

Argh, sorry!

I did a very quick read on them right now and it looks like a BPF map
iterator would just be a combination of BPF read and execute ("bpf {
map_read prog_run }" in SELinux policy terms).  Would it make more
sense to just use the existing security_bpf_map() and
security_bpf_prog() hooks here?

If I can make an example, if you have a linked list, a map iterator is
like calling a function for each element of the list, allowing the
function to read and write the element.

For now I didn't think about programs. Lorenz, which reported this
issue in the first place, is planning on rethinking access control on
all eBPF objects.

When you create a map iterator, you pass the program you want to run
and a map reference to bpftool (user space). bpftool first retrieves
the fds for the program and the map (thus, security_bpf_prog() and
security_bpf_map() are called). The fds are then passed to the kernel
with another system call to create the map iterator. The map iterator
is a named kernel object, with a corresponding file in the bpf
filesystem. Reading that file causes the iteration to start, by running
the program for each map element.

The only missing part is checking the fd modes granted by LSMs at the
time the map iterator is created. If the map iterator allows read and
write, both FMODE_CAN_READ and FMODE_CAN_WRITE should be set.

quoted

Map fd directly used by eBPF programs without system call:
https://lore.kernel.org/bpf/20220926154430.1552800-1-roberto.sassu@huaweicloud.com/ (local)

Another instance of "can you please explain this use case?" ;)

Lorenz discovered that. Despite LSMs check which permissions user space
requested for a map fd, there is no corresponding check of the fd
modes. Normally, a map fd is passed in a system call to perform a map
operation. In this case, it is not. It is set in the program code, and
eBPF transforms the map fd into a map address suitable to be passed to
kernel helpers which directly access the kernel object.

I'm not going to hazard too much of a guess here, but if the map is
being passed between tasks and a fd is generated from that map, we
may
be able to cover this with logic similar
security/selinux/hooks.c:bpf_fd_pass() ... but I'm really stretching
my weak understanding of BPF here.

quoted

Another problem is that there is no DAC, only MAC (work in
progress). I
don't know exactly the status of enabling unprivileged eBPF.

It is my opinion that we need to ensure both DAC and MAC are present
in the code.  This thread makes me worry that some eBPF DAC controls
are being ignored because one can currently say "we're okay because
you need privilege!".  That may be true today, but I can imagine a
time in the not too distant future where unpriv eBPF is allowed and
we
suddenly have to bolt on a lot of capable() checks ... which is a
great recipe for privilege escalation bugs.

quoted

Apart from this, now the discussion is focusing on the following
problem. A map (kernel object) can be referenced in two ways: by ID
or
by path. By ID requires CAP_ADMIN, so we can consider by path for
now.

Given a map fd, the holder of that fd can create a new reference
(pinning) to the map in the bpf filesystem (a new file whose
private
data contains the address of the kernel object).

Pinning a map does not have a corresponding permission. Any fd mode
is
sufficient to do the operation. Furthermore, subsequent requests to
obtain a map fd by path could result in receiving a read-write fd,
while at the time of pinning the fd was read-only.

Since the maps carry their own label I think we are mostly okay, even
if the map is passed between tasks by some non-fd related mechanism.
However, I am now slightly worried that if a fd is obtained with
perms
that don't match the underlying map and that fd is then passed to
another task the access control check on the fd passing would not be
correct.  Operations on the map from a SELinux perspective should
still be okay (the map has its own label), but still.

I'm wondering if we do want to move the SELinux BPF fd passing code
to
check the bpf_attr:open_flags perms.  Thoughts?

Seems correct as it is, but I'm not completely familiar with this work.
As long as you ensure that the fd holder has the right to access the
map, then it will be still responsibility of eBPF to just check the fd
modes.

While this does not seem to me a concern from MAC perspective, as

quoted

attempts to get a map fd still have to pass through
security_bpf_map(),
in general this should be fixed without relying on LSMs.

Agreed.  The access controls need to work both for DAC and DAC+LSM.

@all, what do you think?

Is the plan to ensure that the map and fd permissions are correct

quoted

at
the core BPF level, or do we need to do some additional checks in
the
LSMs (currently only SELinux)?

Should we add a new map_pin permission in SELinux?

Maybe?  Maybe not?  I don't know, can you help me understand map
pinning a bit more first?

I'm not completely sure that this is correct. Pinning a map seems to me
like creating a new dentry for the inode.

Should we have DAC to restrict pinnning without LSMs?

Similar to above.

If we had DAC, even without restricting pinning, fds have to be
obtained if the DAC check passed (if we don't want to rely exclusively

on MAC). Even if pinning was done with a read-only fd, a new read-write fd can be obtained if the owner allowed you to do so.

Restricting pinning might be risky to break compatibility with existing
applications.

Roberto

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help