Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc

[RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-17
[RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-17
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Martin KaFai Lau <martin.lau@linux.dev> · 2024-01-23
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-24
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Martin KaFai Lau <martin.lau@linux.dev> · 2024-01-26
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-27
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Martin KaFai Lau <martin.lau@linux.dev> · 2024-01-30
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Kui-Feng Lee <hidden> · 2024-01-30
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Martin KaFai Lau <martin.lau@linux.dev> · 2024-01-31
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Kui-Feng Lee <hidden> · 2024-01-31
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-31
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-31
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Martin KaFai Lau <martin.lau@linux.dev> · 2024-02-02
Re: [RFC PATCH v7 1/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-02-09
[RFC PATCH v7 2/8] net_sched: Add kfuncs for working with skb · Amery Hung <hidden> · 2024-01-17
[RFC PATCH v7 3/8] net_sched: Introduce kfunc bpf_skb_tc_classify() · Amery Hung <hidden> · 2024-01-17
[RFC PATCH v7 4/8] net_sched: Add reset program · Amery Hung <hidden> · 2024-01-17
[RFC PATCH v7 5/8] net_sched: Add init program · Amery Hung <hidden> · 2024-01-17
[RFC PATCH v7 6/8] tools/libbpf: Add support for BPF_PROG_TYPE_QDISC · Amery Hung <hidden> · 2024-01-17
Re: [RFC PATCH v7 6/8] tools/libbpf: Add support for BPF_PROG_TYPE_QDISC · Andrii Nakryiko <hidden> · 2024-01-23
Re: [RFC PATCH v7 6/8] tools/libbpf: Add support for BPF_PROG_TYPE_QDISC · Amery Hung <hidden> · 2024-01-23
[RFC PATCH v7 7/8] samples/bpf: Add an example of bpf fq qdisc · Amery Hung <hidden> · 2024-01-17
Re: [RFC PATCH v7 7/8] samples/bpf: Add an example of bpf fq qdisc · Daniel Borkmann <daniel@iogearbox.net> · 2024-01-24
Re: [RFC PATCH v7 7/8] samples/bpf: Add an example of bpf fq qdisc · Amery Hung <hidden> · 2024-01-26
[RFC PATCH v7 8/8] samples/bpf: Add an example of bpf netem qdisc · Amery Hung <hidden> · 2024-01-17
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Stanislav Fomichev <hidden> · 2024-01-23
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Daniel Borkmann <daniel@iogearbox.net> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Jamal Hadi Salim <jhs@mojatatu.com> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Daniel Borkmann <daniel@iogearbox.net> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Jamal Hadi Salim <jhs@mojatatu.com> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Daniel Borkmann <daniel@iogearbox.net> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Amery Hung <hidden> · 2024-01-24
Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc · Daniel Borkmann <daniel@iogearbox.net> · 2024-01-25

From: Daniel Borkmann <daniel@iogearbox.net>
Date: 2024-01-24 10:10:49
Also in: bpf

On 1/23/24 10:13 PM, Stanislav Fomichev wrote:

On 01/17, Amery Hung wrote:

quoted

Hi,

I am continuing the work on an eBPF-based Qdisc, building on Cong’s previous
RFC. The following are some use cases of eBPF Qdisc:

1. Allow customizing Qdiscs in an easier way, so that people don't
    have to write a complete Qdisc kernel module just to experiment
    with a new queuing theory.

2. Solve EDT's problem. EDT calculates the "tokens" in clsact, which
    runs before enqueue, so it is impossible to adjust those "tokens"
    after packets get dropped in enqueue. With eBPF Qdisc, this is easy
    to solve with a shared map between clsact and sch_bpf.

3. Replace qevents, as now the user gains much more control over the
    skb and queues.

4. Provide a new way to reuse TC filters. Currently TC relies on filter
    chain and block to reuse the TC filters, but they are too complicated
    to understand. With eBPF helper bpf_skb_tc_classify(), we can invoke
    TC filters on _any_ Qdisc (even on a different netdev) to do the
    classification.

5. Potentially pave the way for ingress to queue packets, although
    the current implementation is still only for egress.

I’ve combed through the previous comments and appreciate the feedback.
The major changes in this RFC are the use of a kptr to the skb to maintain
the validity of the skb during its lifetime in the Qdisc, dropping rbtree
maps, and the inclusion of two examples.

Some questions for discussion:

1. We now pass a trusted kptr of sk_buff to the program instead of
    __sk_buff. This makes most helpers using __sk_buff incompatible
    with eBPF qdisc. An alternative is to still use __sk_buff in the
    context and use bpf_cast_to_kern_ctx() to acquire the kptr. However,
    this can only be applied to the enqueue program, since in the dequeue
    program skbs do not come from ctx but from kptrs exchanged out of maps
    (i.e., there is no __sk_buff). Any suggestions for making skb kptr and
    helper functions compatible?

2. The current patchset uses netlink. Do we also want to use bpf_link
    for attachment?

[..]

quoted

3. People have suggested struct_ops. We chose not to use struct_ops since
    users might want to create multiple bpf qdiscs with different
    implementations. Current struct_ops attachment model does not seem
    to support replacing only functions of a specific instance of a module,
    but I might be wrong.

I still feel like it deserves at least a try. Maybe we can find some potential
path where struct_ops can allow different implementations (Martin probably
has some ideas about that). I looked at the bpf qdisc itself and it doesn't
really have anything complicated (besides trying to play nicely with other
tc classes/actions, but I'm not sure how relevant that is).

Plus it's also not used in the two sample implementations, given you can
implement this as part of the enqueue operation in bpf. It would make sense
to drop the kfunc from the set. One other note: the BPF samples have been
bitrotting for quite a while, so please convert this into a proper BPF
selftest so that BPF CI can run this.

With struct_ops you can also get your (2) addressed.

+1

Thanks,
Daniel
