Thread (26 messages) 26 messages, 7 authors, 2021-01-29

Re: [PATCH bpf] x86/bpf: handle bpf-program-triggered exceptions properly

From: Alexei Starovoitov <hidden>
Date: 2021-01-29 02:34:00

On Thu, Jan 28, 2021 at 06:18:37PM -0800, Andy Lutomirski wrote:
On Thu, Jan 28, 2021 at 5:53 PM Alexei Starovoitov
[off-list ref] wrote:
quoted
On Thu, Jan 28, 2021 at 05:31:35PM -0800, Andy Lutomirski wrote:
quoted
What exactly could the fault code even do to fix this up?  Something like:

if (addr == 0 && SMAP off && error_code says it's kernel mode && we
don't have permission to map NULL) {
  special care for bpf;
}
right. where 'special care' is checking extable and skipping single
load instruction.
quoted
This seems arbitrary and fragile.  And it's not obviously
implementable safely without taking locks,
search_bpf_extables() needs rcu_read_lock() only.
Not the locks you're talking about.
I mean the locks in the if statement.  How am I supposed to tell
whether this fault is a special bpf fault or a normal page fault
without taking a lock to look up the VMA or to do some other hack?
search_bpf_extables() only needs a faulting rip.
No need to lookup vma.
if (addr == 0 && search_bpf_extables(regs->ip)...
is trivial enough and won't penalize page faults in general.
These conditions are not going to happen in the normal kernel code.
Also, why should BPF get special dispensation to generate a special
kind of nonsensical page fault that no other kernel code is allowed to
do?
bpf is special, since it cares about performance a lot
and saving branches in critical path is important.
quoted
quoted
which we really ought not
to be doing from inside arbitrary bpf programs.
Not in arbitrary progs. It's only available to progs loaded by root.
quoted
Keep in mind that, if
SMAP is unavailable, the fault code literally does not know whether
the page fault originated form a valid uaccess region.

There's also always the possibility that you simultaneously run bpf
and something like Wine or DOSEMU2 that actually maps the zero page,
in which case the effect of the bpf code is going to be quite erratic
and will depend on which mm is loaded.
It's tracing. Folks who write those progs accepted the fact that
the data returned by probe_read is not going to be 100% accurate.
Is this really all tracing or is some of it actual live network code?
Networking bpf progs don't use this. bpf tracing does.
But I'm not sure why you're asking.
Tracing has higher performance demands than networking.
quoted
bpf jit can insert !null check before every such special load
(since it knows all of them), but it's an obvious performance loss
that would be good to avoid. If fault handler can do this
if (addr == 0 && ...) search_bpf_extables()
at least in some conditions and cpu flags it's already a win.
In all other cases bpf jit will insert !null checks.
Again, there is no guarantee that a page fault is even generated for
this.  And it doesn't seem very reasonable for the fault code to have
to decide whether a NULL pointer dereference is a special BPF fault or
a real user access fault against a VMA at address 9.
The faulting address and faulting ip will precisely identify this situation.
There is no guess work.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help