Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps

[PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 01/14] net: filter: split filter.c into two files · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 05/14] bpf: add lookup/update/delete/iterate methods to BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 06/14] bpf: add hashtable type of BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Andy Lutomirski <luto@amacapital.net> · 2014-06-29
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-06-29
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-01
RE: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · David Laight <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Chema Gonzalez <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Chema Gonzalez <hidden> · 2014-07-02
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-03
RE: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · David Laight <hidden> · 2014-07-03
Re: [PATCH RFC net-next 08/14] bpf: add eBPF verifier · Alexei Starovoitov <hidden> · 2014-07-03
[PATCH RFC net-next 10/14] net: sock: allow eBPF programs to be attached to sockets · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 14/14] samples: bpf: example of tracing filters with eBPF · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 12/14] samples: bpf: add mini eBPF library to manipulate maps and programs · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events · Alexei Starovoitov <hidden> · 2014-07-01
[PATCH RFC net-next 09/14] bpf: allow eBPF programs to use maps · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Greg KH <gregkh@linuxfoundation.org> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · Alexei Starovoitov <hidden> · 2014-06-30
RE: [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload · David Laight <hidden> · 2014-06-30
[PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Joe Perches <joe@perches.com> · 2014-06-28
Re: [PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry · Alexei Starovoitov <hidden> · 2014-06-28
[PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-29
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-06-29
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-06-30
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-01
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-01
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-02
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-03
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-03
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Andy Lutomirski <luto@amacapital.net> · 2014-07-04
Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps · Alexei Starovoitov <hidden> · 2014-07-05
[PATCH RFC net-next 02/14] net: filter: split filter.h and expose eBPF to user space · Alexei Starovoitov <hidden> · 2014-06-28
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Kees Cook <hidden> · 2014-06-30
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Daniel Borkmann <hidden> · 2014-07-01
Re: [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples · Kees Cook <hidden> · 2014-07-02

From: Andy Lutomirski <hidden>
Date: 2014-07-04 15:17:49
Also in: linux-api, lkml

On Wed, Jul 2, 2014 at 7:29 PM, Alexei Starovoitov [off-list ref] wrote:

On Wed, Jul 2, 2014 at 6:43 PM, Andy Lutomirski [off-list ref] wrote:

quoted

On Tue, Jul 1, 2014 at 10:33 PM, Alexei Starovoitov [off-list ref] wrote:

quoted

I want to avoid string names, since they will force new 'strtab', 'symtab'
sections in the programs/maps and will uglify the user interface quite a bit.

To be fair, you really need to imitate ELF here.  A very simple
relocation-like table should do the trick.

simple.. right :) I do see the amount of struggle you have with
binutils and vdso.
I really don't want to add relocation unless this is last resort.
Especially since it can be solved without it.
I don't think I explained it enough in my last email… trying again:

quoted

Back in september one loadable unit was: one eBPF program + set of maps,
but tracing requirements forced a change, since multiple programs need
to access the same map and maps may need to be pre-populated before
the programs start executing, so I've split maps and programs into mostly
independent entities, but programs still need to think of maps as local:
For example I want to do a skb leak check 'tracing filter':
- attach this program to kretprobe of __alloc_skb():
  u64 key = (u64) skb;
  u64 value = bpf_get_time();
  bpf_update_map_elem(1/*const_map_id*/, &key, &value);
- attach this program to consume_skb and kfree_skb tracepoints:
  u64 key = (u64) skb;
  bpf_delete_map_elem(1/*const_map_id*/, &key);
- and have user space do:
  prior to loading:
  bpf_create_map(1/*map_id*/, 8/*key_size*/, 8/*value*/, 1M /*max_entries*/)
  and then periodically iterate the map to see whether any skb stayed
  in the map for too long.

Programs need to be written with hard coded map_ids otherwise usability
suffers, so I did global 32-bit id in this RFC
, but this indeed doesn't work

Really?  That will mean that you have to edit the source of your
filter program if the small integer map number you chose conflicts
with another program.  That sounds unpleasant.

unpleasant. exactly. that's why I'm proposing per-process local map-id,
so that programs don't need to be edited.

quoted

for unprivileged chrome browser unless programs are previously loaded
by root and chrome only does attach to seccomp.

So here is the non-root bpf syscall interface I'm thinking about:

ufd = bpf_create_map(map_id, key_size, value_size, max_entries);

it will create a global map in the system which will be accessible
in this process via 'ufd'. Internally this 'ufd' will be assigned global map_id
and process-local map_id that was passed as a 1st argument.
To do update/lookup the process will use bpf_map_xxx_elem(ufd,…)

Erk.  Unprivileged programs shouldn't be able to allocate global ids
of their choosing, especially if privileged programs can also do it.
Otherwise unprivileged programs can force a collision and possibly
steal information.

of course. that's not what said.

quoted

Then to load eBPF program the process will do:
ufd = bpf_prog_load(prog_type, ebpf_insn_array, license)
and instructions will be referring to maps via local map_id that
was hard coded as part of the program.

I think relocations would be must prettier than per-process map id tables.

I think per process map_id are much cleaner.

non-root API:

ufd = bpf_create_map(local_map_id,… )
bpf_map_update/delete/lookup_elem(ufd,…)
ufd = bpf_prog_load(insns)
close(ufd)

root only API:

global_id = bpf_get_id(ufd) // returns either map or prog global id
bpf_map_delete(global_map_id)
bpf_prog_unload(global_prog_id)

Details:

ufd = bpf_create_map(local_map_id, ...);

local_map_id - process local map_id
(this id is used to access maps from eBPF program loaded by this process)

ufd - process local file descriptor
(used to update/lookup maps from this process)

global_map_id = bpf_get_id(ufd)
this is root only call to get global_ids and pass them to global events
like tracing.

global ids will only be seen by root. There is no way for root or non-root
to influence id ranges.

quoted

Beyond the normal create_map, update/lookup/delete, load_prog
operations (that are accessible to both root and non-root), the root user
gains one more operations: bpf_get_global_id(ufd) that returns
global map_id or prog_id. This id can be attached to global events
like tracing. Non-root users lose ability to do delete_map and
unload_prog (they do close(ufd) instead), so this ops are for root
only and operate on global ids.
This is the cleanest way I could think of to combine non-root
security, per-process id and global id all in one API. Thoughts?

I think I'm okay with this part, although an interface to get a map fd
given some reference to the program (in sysfs) that uses it would also
work and maybe be more straightforward.

If you meant debugfs, then yes. I'm planning to add a way for root
to see all loaded programs and maps (similar to /proc as lsmod does),
and then do bpf_map_delete/bpf_prog_unload (similar to rmmod)

setsockopt and seccomp will be non-root and programs will go
through additional dont_leak_pointers check in verifier.
tracing/dtrace will be for root only, since they would need to attach
to global events.

I think it will be cleaner once I finish fd conversion as a patch.

OK

FWIW, per-process local id maps sound almost equivalent to relocations
-- the latter could be as simple as an extra nlattr giving a list of
pairs of (per-eBPF-program id, fd).

My current binutils mess is mainly just because I'm trying to do
something weird with an old, old file format that needs to support
lots of legacy tools.  You won't have that problem.

--Andy

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help