Thread (44 messages) 44 messages, 7 authors, 2020-02-12

Re: BPF LSM and fexit [was: [PATCH bpf-next v3 04/10] bpf: lsm: Add mutable hooks list for the BPF LSM]

From: Jann Horn <jannh@google.com>
Date: 2020-02-11 21:33:06
Also in: bpf, lkml

On Tue, Feb 11, 2020 at 9:33 PM Jann Horn [off-list ref] wrote:
On Tue, Feb 11, 2020 at 9:10 PM Alexei Starovoitov
[off-list ref] wrote:
quoted
On Tue, Feb 11, 2020 at 08:36:18PM +0100, Jann Horn wrote:
quoted
On Tue, Feb 11, 2020 at 8:09 PM Alexei Starovoitov
[off-list ref] wrote:
quoted
On Tue, Feb 11, 2020 at 07:44:05PM +0100, Jann Horn wrote:
quoted
On Tue, Feb 11, 2020 at 6:58 PM Alexei Starovoitov
[off-list ref] wrote:
quoted
On Tue, Feb 11, 2020 at 01:43:34PM +0100, KP Singh wrote:
[...]
quoted
quoted
* When using the semantic provided by fexit, the BPF LSM program will
  always be executed and will be able to override / clobber the
  decision of LSMs which appear before it in the ordered list. This
  semantic is very different from what we currently have (i.e. the BPF
  LSM hook is only called if all the other LSMs allow the action) and
  seems to be bypassing the LSM framework.
It that's a concern it's trivial to add 'if (RC == 0)' check to fexit
trampoline generator specific to lsm progs.
[...]
quoted
Using fexit mechanism and bpf_sk_storage generalization is
all that is needed. None of it should touch security/*.
[...]
quoted
Some of the lsm hooks are in critical path. Like security_socket_sendmsg().
retpoline hurts. If we go with indirect calls right now it will be harder to
optimize later. It took us long time to come up with bpf trampoline and build
bpf dispatcher on top of it to remove single indirect call from XDP runtime.
For bpf+lsm would be good to avoid it from the start.
Just out of curiosity: Are fexit hooks really much cheaper than indirect calls?

AFAIK ftrace on x86-64 replaces the return pointer for fexit
instrumentation (see prepare_ftrace_return()). So when the function
returns, there is one return misprediction for branching into
return_to_handler(), and then the processor's internal return stack
will probably be misaligned so that after ftrace_return_to_handler()
is done running, all the following returns will also be mispredicted.

So I would've thought that fexit hooks would have at least roughly the
same impact as indirect calls - indirect calls via retpoline do one
mispredicted branch, fexit hooks do at least two AFAICS. But I guess
indirect calls could still be slower if fexit benefits from having all
the mispredicted pointers stored on the cache-hot stack while the
indirect branch target is too infrequently accessed to be in L1D, or
something like that?
Ah, kpsingh explained to me that BPF trampolines work differently from
normal ftrace and don't do the return address poking. Still, the
processor's internal call stack will be misaligned by the BPF
trampoline, right? Since there is one more call than return, if e.g.
security_blah() is hooked, then the kernel will probably predict
security_blah+5 as the target when returning from the trampoline to
security_blah's parent, then when returning from security_blah's
parent to the grandparent predict the parent, and so on, right?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help