[PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 01/14] net: filter: split filter.c into two files · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 05/14] bpf: add lookup/update/delete/iterate methods to BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 06/14] bpf: add hashtable type of BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Andy Lutomirski <luto@amacapital.net> · 2014-06-29
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-29
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-01
RE: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · David Laight <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Chema Gonzalez <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Chema Gonzalez <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-03
RE: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · David Laight <hidden> · 2014-07-03
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-03
[PATCH RFC net-next 10/14] net: sock: allow eBPF programs to be attached to sockets · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 14/14] samples: bpf: example of tracing filters with eBPF · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 12/14] samples: bpf: add mini eBPF library to manipulate maps and programs · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Alexei Starovoitov <hidden> · 2014-07-01
[PATCH RFC net-next 09/14] bpf: allow eBPF programs to use maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Greg KH <gregkh@linuxfoundation.org> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-30
RE: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · David Laight <hidden> · 2014-06-30
[PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Joe Perches <joe@perches.com> · 2014-06-28
Re: [PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-29
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-29
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-30
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-01
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-01
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-03
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-03
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-04
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-05
[PATCH RFC net-next 02/14] net: filter: split filter.h and expose eBPF to user space · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Kees Cook <hidden> · 2014-06-30
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Kees Cook <hidden> · 2014-07-02

Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples

From: Kees Cook <hidden>
Date: 2014-07-02 16:39:10
Also in: linux-api, lkml

On Tue, Jul 1, 2014 at 12:18 AM, Daniel Borkmann [off-list ref] wrote:

On 07/01/2014 01:09 AM, Kees Cook wrote:

On Fri, Jun 27, 2014 at 5:05 PM, Alexei Starovoitov [off-list ref] wrote:

Hi All,

this patch set demonstrates the potential of eBPF.

The first patch, "net: filter: split filter.c into two files", splits the
eBPF interpreter out of networking into kernel/bpf/. The goal is for the BPF
subsystem to be usable in NET-less configurations. Though the whole set is
marked as RFC, the 1st patch is good to go. A similar version of the patch
was posted a few weeks ago, but was deferred, I'm assuming due to lack of
forward visibility. I hope that this patch set shows what eBPF is capable of
and where it's heading.

Other patches expose the eBPF instruction set to user space and introduce
the concepts of maps and programs accessible via syscall.

'maps' is a generic storage of different types for sharing data between
kernel and userspace. Maps are referenced by global id. Root can create
multiple maps of different types where key/value are opaque bytes of data.
It's up to user space and the eBPF program to decide what they store in the
maps.
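
To make the "opaque bytes" idea concrete, here is a small user-space model in
Python. It is only a sketch and not the kernel interface: the class names and
the key_size/value_size/max_entries parameters are assumptions made for
illustration. Only the lookup/update/delete operations are taken from the
patch titles in this series.

```python
import struct


class ByteMap:
    """User-space model of a map whose keys and values are opaque bytes."""

    def __init__(self, key_size, value_size, max_entries):
        # Sizes are fixed at creation time (an assumption of this sketch).
        self.key_size = key_size
        self.value_size = value_size
        self.max_entries = max_entries
        self._data = {}

    def update(self, key, value):
        if len(key) != self.key_size or len(value) != self.value_size:
            raise ValueError("key/value size mismatch")
        if key not in self._data and len(self._data) >= self.max_entries:
            raise MemoryError("map is full")
        self._data[bytes(key)] = bytes(value)

    def lookup(self, key):
        # Returns None when the key is absent.
        return self._data.get(bytes(key))

    def delete(self, key):
        # Returns True if an entry was removed.
        return self._data.pop(bytes(key), None) is not None


class MapRegistry:
    """Hands out global ids, standing in for the kernel-side id space."""

    def __init__(self):
        self._maps = {}
        self._next_id = 1

    def create(self, key_size, value_size, max_entries):
        map_id = self._next_id
        self._next_id += 1
        self._maps[map_id] = ByteMap(key_size, value_size, max_entries)
        return map_id

    def get(self, map_id):
        return self._maps[map_id]
```

The map itself never interprets the bytes. The two sides agree on an
encoding, for example a 4-byte key and an 8-byte counter packed with
struct.pack("<I", ...) and struct.pack("<Q", ...).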

eBPF programs are similar to kernel modules. They live in global space and
have a unique prog_id. Each program is a safe run-to-completion set of
instructions. The eBPF verifier statically determines that the program
terminates and is safe to execute. During verification the program takes a
hold of the maps that it intends to use, so the selected maps cannot be
removed until the program is unloaded. The program can be attached to
different events. These events can be packets, tracepoint events and other
types in the future. A new event triggers execution of the program, which
may store information about the event in the maps. Beyond storing data, the
programs may call into in-kernel helper functions which may, for example,
dump stack, do trace_printk or other forms of live kernel debugging. The
same program can be attached to multiple events. Different programs can
access the same map:

   tracepoint  tracepoint  tracepoint    sk_buff    sk_buff
    event A     event B     event C      on eth0    on eth1
     |             |          |            |          |
     |             |          |            |          |
     --> tracing <--      tracing       socket      socket
          prog_1           prog_2       prog_3      prog_4
          |  |               |            |
       |---  -----|  |-------|           map_3
     map_1       map_2

User space (via syscall) and eBPF programs access maps concurrently.
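
The lifetime rule described above (a loaded program pins the maps it uses)
can be modelled with plain reference counts. The Python sketch below is an
illustration only: ModelKernel, MapInUseError and all method names are
hypothetical and do not correspond to anything in the patches.

```python
class MapInUseError(Exception):
    """Raised when a map is still held by at least one loaded program."""


class ModelKernel:
    def __init__(self):
        self.map_refs = {}   # map_id -> number of loaded programs holding it
        self.programs = {}   # prog_id -> tuple of map_ids it holds
        self._next_prog_id = 1

    def create_map(self, map_id):
        self.map_refs[map_id] = 0

    def load_program(self, used_maps):
        # Stand-in for verification: every referenced map must exist.
        missing = [m for m in used_maps if m not in self.map_refs]
        if missing:
            raise KeyError(f"unknown maps: {missing}")
        # The program takes a hold of each map it intends to use.
        for m in used_maps:
            self.map_refs[m] += 1
        prog_id = self._next_prog_id
        self._next_prog_id += 1
        self.programs[prog_id] = tuple(used_maps)
        return prog_id

    def unload_program(self, prog_id):
        # Unloading drops the holds taken at load time.
        for m in self.programs.pop(prog_id):
            self.map_refs[m] -= 1

    def delete_map(self, map_id):
        if self.map_refs[map_id] > 0:
            raise MapInUseError(map_id)
        del self.map_refs[map_id]
```

With the layout from the diagram, map_2 is shared by prog_1 and prog_2, so
in this model it can only be deleted after both programs are unloaded.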

The last two patches are sample code. The 1st demonstrates stateful packet
inspection. It counts tcp and udp packets on eth0. It should be easy to see
how this eBPF framework can be used for network analytics.
The 2nd sample does a simple 'drop monitor'. It attaches to the kfree_skb
tracepoint event and counts the number of packet drops at a particular $pc
location. User space periodically summarizes what the eBPF programs
recorded.
In these two samples the eBPF programs are tiny and written in 'assembler'
with macros. More complex programs can be written in C (the llvm backend is
not part of this diff to reduce the 'huge' perception).
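
The division of labour in the 'drop monitor' sample (the program counts per
$pc, user space summarizes) can be sketched in a few lines of Python. This
is a model of the idea, not the sample itself: the function names and the
addresses used below are made up for illustration.

```python
from collections import Counter

# Stands in for a map keyed by the $pc where the packet was dropped.
drops_by_pc = Counter()


def on_kfree_skb(pc):
    """Model of the program body run on each kfree_skb event."""
    drops_by_pc[pc] += 1


def summarize(counter, top=3):
    """Model of the periodic user-space pass: busiest locations first."""
    return [(hex(pc), count) for pc, count in counter.most_common(top)]


# Hypothetical drop locations, three events in total.
for pc in (0xffffffff81000010, 0xffffffff81000010, 0xffffffff81000200):
    on_kfree_skb(pc)
```

The tcp/udp counter from the 1st sample has the same shape, with the
protocol number as the key instead of $pc.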
Since eBPF is fully JITed on x64, the cost of running an eBPF program is
very small even for high frequency events. Here are the numbers comparing
flow_dissector in C vs eBPF:
  x86_64 skb_flow_dissect() same skb (all cached)         -  42 nsec per call
  x86_64 skb_flow_dissect() different skbs (cache misses) - 141 nsec per call
eBPF+jit skb_flow_dissect() same skb (all cached)         -  51 nsec per call
eBPF+jit skb_flow_dissect() different skbs (cache misses) - 135 nsec per call
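
For a quick read of those four numbers, the relative cost of the eBPF+jit
version against the C version can be worked out as below. The helper name
is made up for this note. The inputs are the figures quoted above.

```python
def relative_cost(ebpf_ns, native_ns):
    """Percentage difference of eBPF+jit against native C (positive = slower)."""
    return (ebpf_ns - native_ns) / native_ns * 100.0


cached = relative_cost(51, 42)      # same skb, all cached
uncached = relative_cost(135, 141)  # different skbs, cache misses

print(f"same skb (all cached):          {cached:+.1f}%")    # +21.4%
print(f"different skbs (cache misses):  {uncached:+.1f}%")  # -4.3%
```

So in these measurements the JITed program is about 21% slower when
everything is cached and about 4% faster when cache misses dominate.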

A detailed explanation of the eBPF verifier and safety is in patch 08/14.


This is very exciting! Thanks for working on it. :)

Between the new eBPF syscall and the new seccomp syscall, I'm really
looking forward to using lookup tables for seccomp filters. Under
certain types of filters, we'll likely see some non-trivial
performance improvements.

Well, if I read this correctly, the eBPF syscall lets you set up maps, etc,
but the only way to attach eBPF is via setsockopt for network filters right
now (and via tracing). Seccomp will still make use of classic BPF, so you
won't be able to use it there.

Currently, yes. But once this is in, and the new seccomp syscall is
in, we can add a SECCOMP_FILTER_EBPF flag to the "flags" field to
instruct seccomp to load an eBPF program instead of a classic BPF one.
I'm excited for the future. :)

-Kees

-- 
Kees Cook
Chrome OS Security
