Re: [RFC Patch bpf-next] bpf: introduce bpf timer

[RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-01
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-01
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-01
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-01
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-05
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Song Liu <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-08
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-02
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-06
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-12
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-15
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-15
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-15
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-26
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-26
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-26
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-27
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Jamal Hadi Salim <jhs@mojatatu.com> · 2021-04-27
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-04-27
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Alexei Starovoitov <hidden> · 2021-04-27
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-05-09
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Jamal Hadi Salim <jhs@mojatatu.com> · 2021-05-10
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-05-11
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Jamal Hadi Salim <jhs@mojatatu.com> · 2021-05-12
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Joe Stringer <hidden> · 2021-05-11
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-05-11
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Jamal Hadi Salim <jhs@mojatatu.com> · 2021-05-12
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Jamal Hadi Salim <jhs@mojatatu.com> · 2021-05-13
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Cong Wang <hidden> · 2021-05-14
Re: [RFC Patch bpf-next] bpf: introduce bpf timer · Joe Stringer <hidden> · 2021-08-11

From: Alexei Starovoitov <hidden>
Date: 2021-04-02 23:45:06
Also in: bpf

On Fri, Apr 02, 2021 at 02:24:51PM -0700, Cong Wang wrote:

quoted

where the key is the timer ID and the value is the timer expire
timer.

The timer ID is unnecessary. We cannot introduce new IDR for every new
bpf object. It doesn't scale.

The IDR is per map, not per timer.

Per-map is not acceptable. One IDR for all maps with timers is not acceptable either.
We have 3 IDRs now: for progs, for maps, and for links.
No other objects need IDRs.

quoted

Here is how more general timers might look like:
https://lore.kernel.org/bpf/20210310011905.ozz4xahpkqbfkkvd@ast-mbp.dhcp.thefacebook.com/ (local)

include/uapi/linux/bpf.h:
struct bpf_timer {
  u64 opaque;
};
The 'opaque' field contains a pointer to dynamically allocated struct timer_list and other data.

This is my initial design as we already discussed, it does not work,
please see below.

It does work. The perceived "issue" you referred to is a misunderstanding. See below.

quoted

The prog would do:
struct map_elem {
    int stuff;
    struct bpf_timer timer;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1);
    __type(key, int);
    __type(value, struct map_elem);
} hmap SEC(".maps");

static int timer_cb(struct map_elem *elem)
{
    if (whatever && elem->stuff)
        bpf_timer_mod(&elem->timer, new_expire);
}

int bpf_timer_test(...)
{
    struct map_elem *val;

    val = bpf_map_lookup_elem(&hmap, &key);
    if (val) {
        bpf_timer_init(&val->timer, timer_cb, flags);
        val->stuff = 123;
        bpf_timer_mod(&val->timer, expires);
    }
}

bpf_map_update_elem() either from bpf prog or from user space
allocates map element and zeros 8 byte space for the timer pointer.
bpf_timer_init() allocates timer_list and stores it into opaque if opaque == 0.
The validation of timer_cb() is done by the verifier.
bpf_map_delete_elem() either from bpf prog or from user space
does del_timer() if elem->opaque != 0.
If prog refers such hmap as above during prog free the kernel does
for_each_map_elem {if (elem->opaque) del_timer().}
I think that is the simplest way of prevent timers firing past the prog life time.
There could be other ways to solve it (like prog_array and ref/uref).

Pseudo code:
int bpf_timer_init(struct bpf_timer *timer, void *timer_cb, int flags)
{
  if (timer->opaque)
    return -EBUSY;
  t = alloc timer_list
  t->cb = timer_cb;
  t->..
  timer->opaque = (long)t;
}

int bpf_timer_mod(struct bpf_timer *timer, u64 expires)
{
  if (!time->opaque)
    return -EINVAL;
  t = (struct timer_list *)timer->opaque;
  mod_timer(t,..);
}

int bpf_timer_del(struct bpf_timer *timer)
{
  if (!time->opaque)
    return -EINVAL;
  t = (struct timer_list *)timer->opaque;
  del_timer(t);
}

The verifier would need to check that 8 bytes occupied by bpf_timer and not accessed
via load/store by the program. The same way it does it for bpf_spin_lock.

This does not work, because bpf_timer_del() has to be matched
with bpf_timer_init(), otherwise we would leak timer resources.
For example:

SEC("foo")
bad_ebpf_code()
{
  struct bpf_timer t;
  bpf_timer_init(&t, ...); // allocate a timer
  bpf_timer_mod(&t, ..);
  // end of BPF program
  // now the timer is leaked, no one will delete it
}

We can not enforce the matching in the verifier, because users would
have to call bpf_timer_del() before exiting, which is not what we want
either.

bad_ebpf_code()
{
  struct bpf_timer t;

is not at all what was proposed. This kind of code will be rejected by the verifier.

'struct bpf_timer' has to be part of the map element and the verifier will enforce that
just like it does so for bpf_spin_lock.
Try writing the following program:

bad_ebpf_code()
{
  struct bpf_spin_lock t;
  bpf_spin_lock(&t);
}
``
and then follow the code to see why the verifier rejects it.

The implementation of what I'm proposing is straightforward.
I certainly understand that it might look intimidating and "impossible",
but it's really quite simple.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help