Re: [PATCH bpf-next 0/4] Reduce overhead of LSMs with static calls

From: Casey Schaufler <casey@schaufler-ca.com>
Date: 2023-02-13 18:14:41
Also in: bpf

On 2/12/2023 2:00 PM, Paul Moore wrote:

On Fri, Feb 10, 2023 at 9:32 PM Casey Schaufler [off-list ref] wrote:

quoted

On 2/10/2023 12:03 PM, Paul Moore wrote:

quoted

On Thu, Feb 9, 2023 at 11:56 AM Kees Cook [off-list ref] wrote:

quoted

On Fri, Jan 27, 2023 at 03:16:38PM -0500, Paul Moore wrote:

quoted

On Thu, Jan 19, 2023 at 6:10 PM KP Singh [off-list ref] wrote:

quoted

# Background

LSM hooks (callbacks) are currently invoked as indirect function calls. These
callbacks are registered into a linked list at boot time as the order of the
LSMs can be configured on the kernel command line with the "lsm=" command line
parameter.

Thanks for sending this KP.  I had hoped to make a proper pass through
this patchset this week but I ended up getting stuck trying to wrap my
head around some network segmentation offload issues and didn't quite
make it here.  Rest assured it is still in my review queue.

However, I did manage to take a quick look at the patches and one of
the first things that jumped out at me is it *looks* like this
patchset is attempting two things: fix a problem where one LSM could
trample another (especially problematic with the BPF LSM due to its
nature), and reduce the overhead of making LSM calls.  I realize that
in this patchset the fix and the optimization are heavily
intermingled, but I wonder what it would take to develop a standalone
fix using the existing indirect call approach?  I'm guessing that is
going to potentially be a pretty significant patch, but if we could
add a little standardization to the LSM hooks without adding too much
in the way of code complexity or execution overhead I think that might
be a win independent of any changes to how we call the hooks.

Of course this could be crazy too, but I'm the guy who has to ask
these questions :)

Hm, I am expecting this patch series to _not_ change any semantics of
the LSM "stack". I would agree: nothing should change in this series, as
it should be strictly a mechanical change from "iterate a list of
indirect calls" to "make a series of direct calls". Perhaps I missed
a logical change?

I might be missing something too, but I'm thinking of patch 4/4 in
this series that starts with this sentence:

 "BPF LSM hooks have side-effects (even when a default value is
  returned), as some hooks end up behaving differently due to
  the very presence of the hook."

My understanding of the current "agreement" is that we keep BPF
hooks at the end for this very reason.

It would be nice to not have these conventions.  I get that it's the
only knob we have at the moment to tweak, but I would hope that we
could do better in the future.

Agreed. I don't care much for it.

Ignoring the static call changes for a moment, I'm curious what it
would look like to have a better mechanism for handling things like
this.  What would it look like if we expanded the individual LSM error
reporting back to the LSM layer to have a bit more information, e.g.
"this LSM erred, but it is safe to continue evaluating other LSMs" and
"this LSM erred, and it was too severe to continue evaluating other
LSMs"?  Similarly, would we want to expand the hook registration to
have more info, e.g. "run this hook even when other LSMs have failed"
and "if other LSMs have failed, do not run this hook"?

quoted

I really don't want another LSM to have sway over Smack enforcement.

I think we can all agree that the one LSM should not have the ability
to affect the operation of another, especially when it would cause the
violation of a different LSM's security policy.

quoted

I would hate to see, for example, an LSM decide that because it has
initialized an inode no other LSM should be allowed to, even in an
error situation. There are really only two options Call all the hooks
every time and either succeed on all or report the most important
error. Or, "bail on fail", and acknowledge that following hooks may
not be called. Really, does "I failed, but it's not that important"
make sense as a return value?

Of the two things I tossed out, richer return values and richer hook
registration, perhaps it's really only the latter, richer hook
registration that is important here.  It would allow a LSM to indicate
to the LSM hook layer how the individual hook implementation should be
called: always, or only if previously called implementations have not
failed.  I believe that should eliminate any worry of a BPF LSM, or
any LSM for that matter, from impacting the security policy of
another.  However, I will admit that I haven't spent the necessary
amount of time chasing down all the hooks to verify if that is 100%
correct.

quoted

I realize that loading a BPF LSM is a privileged operation so we've
largely mitigated the risk there, but with stacking on it's way to
being more full featured, and IMA slowly working its way to proper LSM
status, it seems to me like having a richer, and proper way to handle
individual LSM failures would be a good thing.  I feel like patch 4/4
definitely hints at this, but I could be mistaken.

We have bigger issues with BPF. There's nothing to prevent BPF from
implementing a secid_to_secctx() hook and making a system with SELinux
go cattywampus. BPF is stacked as if it isn't a "major" LSM, while
allowing it to do "major" LSM things. One reason we need full stacking
is to address this.

That's a different issue, and one of the reasons why I suggested
taking an all-or-nothing approach to stacking many years ago, but ...
well, you know how that worked out.  I promise to not keep saying "I
told you so" if you promise to not keep bringing up LSM stacking as
the answer to all that ails you ;)

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help