Re: [PATCH v13 3/5] security: Replace indirect LSM hook calls with static calls
From: KP Singh <kpsingh@kernel.org>
Date: 2024-07-03 22:22:55
Also in:
bpf
On Wed, Jul 3, 2024 at 10:56 PM Paul Moore [off-list ref] wrote:
On Wed, Jul 3, 2024 at 12:55 PM KP Singh [off-list ref] wrote:quoted
On Wed, Jul 3, 2024 at 2:07 AM Paul Moore [off-list ref] wrote:quoted
On Jun 29, 2024 KP Singh [off-list ref] wrote:quoted
LSM hooks are currently invoked from a linked list as indirect calls which are invoked using retpolines as a mitigation for speculative attacks (Branch History / Target injection) and add extra overhead which is especially bad in kernel hot paths:[...]quoted
should fix the more obvious problems. I'd like to know if you are aware of any others? If not, the text above should be adjusted and we should reconsider patch 5/5. If there are other problems I'd like to better understand them as there may be an independent solution for those particular problems.We did have problems with some other hooks but I was unable to dig up specific examples though, it's been a while. More broadly speaking, a default decision is still a decision. Whereas the intent from the BPF LSM is not to make a default decision unless a BPF program is loaded. I am quite worried about the holes this leaves open, subtle bugs (security or crashes) we have not caught yet and PATCH 5/5 engineers away the problem of the "default decision".The inode/xattr problem you originally mentioned wasn't really rooted in a "bad" default return value, it was really an issue with how the LSM hook was structured due to some legacy design assumptions made well before the initial stacking patches were merged. That should be fixed now[1] and given that the inode/xattr set/remove hooks were unique in this regard (individual LSMs were responsible for performing the capabilities checks) I don't expect this to be a general problem. There were also some issues caused by the fact that we were defining the default return value in multiple places and these values had gone out of sync in a number of hooks. We've also fixed this problem by only defining the default return value once for each hook, solving all of those problems.
I don't see how this solves problems or prevents any future problems with side-effects. I have always been uncomfortable with an extraneous function being called with a side effect ever since we merged BPF LSM with default callback. We have found one bug due to this, not all the bugs.
I'm not aware of any other existing problems relating to the LSM hook default values, if there are any, we need to fix them independent of this patchset. The LSM framework should function properly if the "default" values are used.
Patch 5 eliminates the possibilities of errors and subtle bugs all together. The problem with subtle bugs is, well, they are subtle, if you and I knew of the bugs, we would fix all of them, but we don't. I really feel we ought to eliminate the class of issues and not just whack-a-mole when we see the bugs. The LSM framework ought to function with default values is a nice goal, but a weaker position than what we have where we make these subtle bugs impossible. If you feel this should not be a part of the framework, I would still like to revert back to implementation where we kept the toggling logic contained to the BPF LSM. - KP
[1] I just realized that commit 61df7b828204 doesn't properly update the removexattr implementations for SELinux and Smack, expect a patch for that soon. The current code is okay, it just does some unnecessary work (see the setxattr changes to get an idea of the changes needed).quoted
quoted
quoted
+#define lsm_for_each_hook(scall, NAME) \ + for (scall = static_calls_table.NAME; \ + scall - static_calls_table.NAME < MAX_LSM_COUNT; scall++) \ + if (static_key_enabled(&scall->active->key))This is probably a stupid question, but why use static_key_enabled() here instead of static_branch_unlikely() as in the call_XXX_macros?The static_key_enabled is a check for the key being enabled, whereas the static_branch_likely is for laying out the right assembly code (jump tables etc.) and based on the value of the static key, here we are not using the static calls or even keys, rather we are following back from direct calls to indirect calls and don't really need any jump tables in the slow path.Gotcha, thanks for the explanation. -- paul-moore.com