Thread: 59 messages, 8 authors, 2024-01-16

Re: [PATCH bpf-next 03/29] bpf: introduce BPF token object

From: Christian Brauner <brauner@kernel.org>
Date: 2024-01-11 10:38:24
Also in: bpf, linux-fsdevel, linux-security-module

quoted
The current check is inconsistent. It special-cases init_user_ns. The
correct thing to do for what you're intending imho is:

bool bpf_token_capable(const struct bpf_token *token, int cap)
{
        struct user_namespace *userns = &init_user_ns;

        if (token)
                userns = token->userns;
        if (ns_capable(userns, cap))
                return true;
        return cap != CAP_SYS_ADMIN && ns_capable(userns, CAP_SYS_ADMIN);
}
Unfortunately the above becomes significantly more hairy when LSM
(security_bpf_token_capable) gets involved, while preserving the rule
"if token doesn't give rights, fall back to init userns checks".
Why? Please explain your reasoning in detail.
I'm happy to accommodate any implementation of bpf_token_capable() as
long as it behaves as discussed above and also satisfies Paul's
requirement that capability checks should happen before LSM checks.
quoted
Because any caller located in an ancestor user namespace of
token->user_ns will be privileged wrt the token's userns as long as
they have that capability in their user namespace.
And with `current_user_ns() == token->userns` check we won't be using
token->userns while the caller is in ancestor user namespace, we'll
use capable() check, which will succeed only in init user_ns, assuming
corresponding CAP_xxx is actually set.
Why? This isn't how any of our ns_capable() logic works.

This basically argues that anyone in an ancestor user namespace is not
allowed to operate on any of their descendant child namespaces unless
they are in the init_user_ns.

But that's nonsense as I'm trying to tell you. Any process in an
ancestor user namespace that is privileged over the child namespace can
just setns() into it and then pass that bpf_token_capable() check by
supplying the token.

At this point just do it properly and allow callers that are privileged
in the token user namespace to load bpf programs. It also means you get
user namespace nesting done properly.
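
To make the ancestor rule concrete, here is a minimal userspace sketch of how a namespaced capability check walks from the target namespace up towards the caller's namespace. It is a simplified model, not kernel code: the `model_*` types, `MODEL_CAP_BPF`, and `model_ns_capable()` are hypothetical names introduced for illustration, and it ignores LSMs, securebits, and everything else a real check involves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of user namespace nesting; not kernel code. */
#define MODEL_CAP_BPF 39

struct model_userns {
	const struct model_userns *parent; /* NULL for the initial namespace */
	int level;                         /* 0 for the initial namespace */
	unsigned int owner_uid;            /* uid in the parent that created it */
};

struct model_cred {
	const struct model_userns *userns; /* namespace the caller lives in */
	unsigned int euid;
	unsigned long long cap_effective;  /* bit N set => capability N held */
};

/*
 * Does the caller hold `cap` with respect to `target`?
 * Walk from the target namespace upwards until the caller's namespace is
 * reached or it is clear the target is not a descendant of it.
 */
static bool model_ns_capable(const struct model_cred *cred,
			     const struct model_userns *target, int cap)
{
	const struct model_userns *ns = target;

	assert(cred && cred->userns && target);
	for (;;) {
		/* Reached the caller's own namespace: the effective set decides. */
		if (ns == cred->userns)
			return (cred->cap_effective >> cap) & 1ULL;
		/* Not deeper than the caller, so it cannot be a descendant. */
		if (ns->level <= cred->userns->level)
			return false;
		/* The creator of a direct child namespace is fully privileged in it. */
		if (ns->parent == cred->userns && ns->owner_uid == cred->euid)
			return true;
		ns = ns->parent;
	}
}
```

In this model a caller in the initial namespace and a caller in an intermediate namespace are treated the same way: each is capable over a descendant if it holds the capability in its own namespace. A caller in a descendant never gains anything over an ancestor.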
quoted
For example, if the caller is in the init_user_ns and permissions
for CAP_WHATEVER is checked for in token->user_ns and the caller has
CAP_WHATEVER in init_user_ns then they also have it in all
descendant user namespaces.
Right, so if they didn't use a token they would still pass
capable(CAP_WHATEVER), right?
Yes, I'm trying to accommodate your request while making it work
consistently.
quoted
The original intention had been to align with what we require during
token creation meaning that once a token has been created interacting
with this token is specifically confined to callers located in the
token's user namespace.

If that's not the case then it doesn't make sense to not allow
permission checking based on regular capability semantics. IOW, why
special case init_user_ns if you're breaking the confinement restriction
anyway.
I'm sorry, perhaps I'm dense, but with `current_user_ns() ==
token->userns` check I think we do fulfill the intention to not allow
using a token in a userns different from the one in which it was
created. If that condition isn't satisfied, the token is immediately
My request originally was about never being able to interact with a
token outside of that userns. This is different as you provide an escape
hatch to init_user_ns. But if you need that and ignore the token then
please do it properly. That's what I'm trying to tell you. See below.
ignored. So you can't use a token from another userns for anything,
it's just not there, effectively.

And as I tried to explain above, I do think that ignoring the token
instead of erroring out early is what we want to provide good
user-space ecosystem integration of BPF token.
There is no erroring out early. It's:

(1) Has a token been provided and is the caller capable wrt the
    namespace of the token? Any caller in an ancestor user namespace
    that has the capability in that user namespace is capable wrt
    that token. That __includes__ callers in the init_user_ns. IOW,
    you don't need to fall back to any special checking for init_user_ns.
    It is literally covered in the if (token) branch with the added
    consistency that a process in an ancestor user namespace is
    privileged wrt that token as well.

(2) No token has been provided. Then do what we always did and perform
    the capability checks based on the initial user namespace.

The only thing that you then still need is to add that token_capable()
hook in there:

bool bpf_token_capable(const struct bpf_token *token, int cap)
{
        struct user_namespace *userns = &init_user_ns;

        if (token)
                userns = token->userns;
        if (ns_capable(userns, cap))
                return true;
        if (cap != CAP_SYS_ADMIN && ns_capable(userns, CAP_SYS_ADMIN))
                return token ? security_bpf_token_capable(token, cap) == 0 : true;
        return false;
}

Or write it however you like. I think this is way more consistent and
gives you a more flexible permission model.
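
To illustrate the difference in behaviour, here is a rough userspace sketch comparing the two policies discussed in this thread. Everything in it is an assumption made for illustration: the `model_*` types and helpers are hypothetical, `special_cased_check()` is only my reading of the `current_user_ns() == token->userns` approach as described above, the LSM hook is left out (treated as always allowing), and `model_ns_capable()` is deliberately reduced to "caller's namespace is the target or one of its ancestors, and the capability is in the caller's effective set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the discussion; not kernel code. */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_BPF       39

struct model_userns {
	const struct model_userns *parent; /* NULL for the initial namespace */
};

struct model_cred {
	const struct model_userns *userns; /* namespace the caller lives in */
	unsigned long long cap_effective;  /* bit N set => capability N held */
};

struct model_token {
	const struct model_userns *userns; /* namespace the token was created in */
};

static const struct model_userns model_init_ns = { NULL };

/* Simplified check: walk from target upwards looking for the caller's ns. */
static bool model_ns_capable(const struct model_cred *cred,
			     const struct model_userns *target, int cap)
{
	const struct model_userns *ns;

	assert(cred && target);
	for (ns = target; ns; ns = ns->parent)
		if (ns == cred->userns)
			return (cred->cap_effective >> cap) & 1ULL;
	return false;
}

/* The capability itself, or CAP_SYS_ADMIN as a superset, in one namespace. */
static bool model_cap_or_admin(const struct model_cred *cred,
			       const struct model_userns *ns, int cap)
{
	if (model_ns_capable(cred, ns, cap))
		return true;
	return cap != MODEL_CAP_SYS_ADMIN &&
	       model_ns_capable(cred, ns, MODEL_CAP_SYS_ADMIN);
}

/* Token honoured only if the caller sits exactly in the token's namespace;
 * otherwise the token is ignored and the initial namespace is checked. */
static bool special_cased_check(const struct model_cred *cred,
				const struct model_token *token, int cap)
{
	const struct model_userns *ns = &model_init_ns;

	if (token && cred->userns == token->userns)
		ns = token->userns;
	return model_cap_or_admin(cred, ns, cap);
}

/* Token present: check against the token's namespace, ancestors included.
 * No token: check against the initial namespace as before. */
static bool proposed_check(const struct model_cred *cred,
			   const struct model_token *token, int cap)
{
	const struct model_userns *ns = token ? token->userns : &model_init_ns;

	return model_cap_or_admin(cred, ns, cap);
}
```

With a namespace chain init -> mid -> leaf and a token created in leaf, a caller in mid holding CAP_BPF passes `proposed_check()` but fails `special_cased_check()`, whereas a caller in the initial namespace passes both. That asymmetry between mid and init is the inconsistency being pointed out.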