Re: [PATCH v13 3/5] security: Replace indirect LSM hook calls with static calls

From: KP Singh <kpsingh@kernel.org>
Date: 2024-07-04 00:24:17
Also in: bpf

On Thu, Jul 4, 2024 at 2:04 AM Casey Schaufler [off-list ref] wrote:

On 7/3/2024 4:08 PM, KP Singh wrote:

quoted

On Thu, Jul 4, 2024 at 12:52 AM Paul Moore [off-list ref] wrote:

quoted

On Wed, Jul 3, 2024 at 6:22 PM KP Singh [off-list ref] wrote:

quoted

On Wed, Jul 3, 2024 at 10:56 PM Paul Moore [off-list ref] wrote:

quoted

On Wed, Jul 3, 2024 at 12:55 PM KP Singh [off-list ref] wrote:

quoted

On Wed, Jul 3, 2024 at 2:07 AM Paul Moore [off-list ref] wrote:

quoted

On Jun 29, 2024 KP Singh [off-list ref] wrote:

quoted

LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:

[...]

quoted

should fix the more obvious problems.  I'd like to know if you are
aware of any others?  If not, the text above should be adjusted and
we should reconsider patch 5/5.  If there are other problems I'd
like to better understand them as there may be an independent
solution for those particular problems.

We did have problems with some other hooks but I was unable to dig up
specific examples though, it's been a while. More broadly speaking, a
default decision is still a decision. Whereas the intent from the BPF
LSM is not to make a default decision unless a BPF program is loaded.
I am quite worried about the holes this leaves open, subtle bugs
(security or crashes) we have not caught yet and PATCH 5/5 engineers away
 the problem of the "default decision".

The inode/xattr problem you originally mentioned wasn't really rooted
in a "bad" default return value, it was really an issue with how the
LSM hook was structured due to some legacy design assumptions made
well before the initial stacking patches were merged.  That should be
fixed now[1] and given that the inode/xattr set/remove hooks were
unique in this regard (individual LSMs were responsible for performing
the capabilities checks) I don't expect this to be a general problem.

There were also some issues caused by the fact that we were defining
the default return value in multiple places and these values had gone
out of sync in a number of hooks.  We've also fixed this problem by
only defining the default return value once for each hook, solving all
of those problems.

I don't see how this solves problems or prevents any future problems
with side-effects. I have always been uncomfortable with an extraneous
function being called with a side effect ever since we merged BPF LSM
with default callback. We have found one bug due to this, not all the
bugs.

You've got to give me something more concrete than that.  If you can't
provide any concrete examples, start with providing a basic concept
with far more detail than just "side-effects".

quoted

I'm not aware of any other existing problems relating to the LSM hook
default values, if there are any, we need to fix them independent of
this patchset.  The LSM framework should function properly if the
"default" values are used.

Patch 5 eliminates the possibilities of errors and subtle bugs all
together. The problem with subtle bugs is, well, they are subtle, if
you and I knew of the bugs, we would fix all of them, but we don't. I
really feel we ought to eliminate the class of issues and not just
whack-a-mole when we see the bugs.

Here's the thing, I don't really like patch 5/5.  To be honest, I
don't really like a lot of this patchset.  From my perspective, the
complexity of the code is likely going to mean more maintenance
headaches down the road, but Linus hath spoken so we're doing this
(although "this" is still a bit undefined as far as I'm concerned).
If you want me to merge patch 5/5 you've got to give me something real
and convincing that can't be fixed by any other means.  My current
opinion is that you're trying to use a previously fixed bug to scare
and/or coerce the merging of some changes I don't really want to
merge.  If you want me to take patch 5/5, you've got to give me a
reason that is far more compelling that what you've written thus far.

Paul, I am not scaring you, I am providing a solution that saves us
from headaches with side-effects and bugs in the future. It's safer by
design.

You say you have not reviewed it carefully, but you did ask me to move
the function from the BPF LSM layer to an LSM API, and we had a bunch
of discussion around naming in the subsequent revisions.

https://lore.kernel.org/bpf/f7e8a16b0815d9d901e019934d684c5f@paul-moore.com/ (local)

My reasons are:

1. It's safer, no side effects, guaranteed to be not buggy. Neither
you, nor me, can guarantee that a default value will be safe in the
LSM layer. I request others (Casey, Kees) for their opinion here too.
2. Performance, no extra function call.

I want to be very careful about the comments I make on this patch set.
I can't say that I trust any fix for the BPF LSM layer. My natural
inclination is to isolate the fix to the area that has the problem,
that is, BPF. I have a hard time accepting the notion that the implementation
will really fix all possible bugs in the future. The pace at which eBPF
is evolving gives me the heebee geebees when I think of it as a mechanism
for implementing security modules.

My biggest concern is that we may be trying too hard for perfection.
I see a situation where we're not moving forward because there are two
reasonable solutions and rather than running with either we're desperately
looking for a compelling reason to pick one over the other.

I am fine with either, if you folks prefer security_toggle_hook to be
in BPF only / limited to BPF. I can revert back to what we had in v9,
the changes to the LSM layer are then very minimal.

- KP

quoted

If you still don't like it, it's your call, I would still like to keep
most of the logic local to the BPF LSM as proposed in
https://lore.kernel.org/bpf/f7e8a16b0815d9d901e019934d684c5f@paul-moore.com/ (local)

- KP

quoted

--
paul-moore.com

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help