Re: [PATCH RFC v1 net-next 1/4] net: Introduce Qdisc backpressure infrastructure

From: Peilin Ye <hidden>
Date: 2022-05-10 02:24:09
Also in: lkml

Hi Dave,

On Mon, May 09, 2022 at 12:53:28AM -0700, Dave Taht wrote:

> I am very pleased to see this work.

Thanks!

> However, my "vision" such as it was, and as misguided as it might be,
> was to implement a facility similar to tcp_notsent_lowat for udp
> packets, tracking the progress of the udp packet through the kernel,
> and supplying backpressure and providing better information about
> where when and why the packet was dropped in the stack back to the
> application.

By "a facility similar to tcp_notsent_lowat", do you mean a smaller
sk_sndbuf, or "UDP Small Queues"?

I don't fully understand the implications of using a smaller sk_sndbuf
yet, but I think it can work together with this RFC.

sk_sndbuf is a per-socket attribute, while this RFC approaches the
problem from the Qdisc's perspective.  Using a smaller sk_sndbuf alone
does not prevent the "when UDP sends faster, TBF simply drops faster"
issue (described in [I] of the cover letter) from happening.  There is
always a point where there are so many sockets that TBF's queue cannot
hold "sk_sndbuf times number of sockets" (roughly speaking) bytes of
skbs.  Past that point, TBF suddenly starts to drop a lot.
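
To make the "sk_sndbuf times number of sockets" estimate concrete, here
is a rough Python sketch.  The queue limit in it is a made-up
placeholder, not the value from the test setup, and the model ignores
how skb memory is actually accounted, so the numbers are illustrative
only.

```python
# Rough model: each UDP socket can have at most about sk_sndbuf bytes
# outstanding, so the Qdisc may be asked to hold up to
# n_sockets * sk_sndbuf bytes at the same time.

SK_SNDBUF = 21299  # example per-socket send buffer in bytes (one tenth of 212992)

# HYPOTHETICAL queue limit in bytes, chosen only for illustration.
HYPOTHETICAL_QUEUE_LIMIT = 200_000


def worst_case_backlog(n_sockets, sk_sndbuf=SK_SNDBUF):
    """Upper bound (roughly) on bytes all sockets can push into the Qdisc."""
    return n_sockets * sk_sndbuf


def max_sockets_without_overflow(queue_limit, sk_sndbuf=SK_SNDBUF):
    """Largest socket count whose worst-case backlog still fits in the queue."""
    return queue_limit // sk_sndbuf


if __name__ == "__main__":
    for n in (2, 15, 20):  # example socket counts
        backlog = worst_case_backlog(n)
        fits = backlog <= HYPOTHETICAL_QUEUE_LIMIT
        print(f"{n:2d} sockets: up to {backlog} bytes, fits in queue: {fits}")
```

Under this model, shrinking sk_sndbuf only moves the socket count at
which the queue overflows; it does not remove the overflow point.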

For example, I used the default 212992 sk_sndbuf
(/proc/sys/net/core/wmem_default) in the test setup ([V] in the cover
letter).  Let's make it one tenth as large, 21299.  It works well for
the 2-client setup; zero packets dropped.  However, if we test it with
15 iperf2 clients:

[  3]  0.0-30.0 sec  46.4 MBytes  13.0 Mbits/sec   1.251 ms 89991/123091 (73%)
[  3]  0.0-30.0 sec  46.6 MBytes  13.0 Mbits/sec   2.033 ms 91204/124464 (73%)
[  3]  0.0-30.0 sec  46.5 MBytes  13.0 Mbits/sec   0.504 ms 89879/123054 (73%)
<...>                                                       ^^^^^^^^^^^^ ^^^^^

73% drop rate again.  Now apply this RFC:

[  3]  0.0-30.0 sec  46.3 MBytes  12.9 Mbits/sec   1.206 ms  807/33839 (2.4%)
[  3]  0.0-30.0 sec  45.5 MBytes  12.7 Mbits/sec   1.919 ms  839/33283 (2.5%)
[  3]  0.0-30.0 sec  45.8 MBytes  12.8 Mbits/sec   2.521 ms  837/33508 (2.5%)
<...>                                                        ^^^^^^^^^ ^^^^^^

Down to 3% again.

Next, same 21299 sk_sndbuf, 20 iperf2 clients, without RFC:

[  3]  0.0-30.0 sec  34.5 MBytes  9.66 Mbits/sec   1.054 ms 258703/283342 (91%)
[  3]  0.0-30.0 sec  34.5 MBytes  9.66 Mbits/sec   1.033 ms 257324/281964 (91%)
[  3]  0.0-30.0 sec  34.5 MBytes  9.66 Mbits/sec   1.116 ms 257858/282500 (91%)
<...>                                                       ^^^^^^^^^^^^^ ^^^^^

91% drop rate.  Finally, apply RFC:

[  3]  0.0-30.0 sec  34.4 MBytes  9.61 Mbits/sec   0.974 ms 7982/32503 (25%)
[  3]  0.0-30.0 sec  34.1 MBytes  9.54 Mbits/sec   1.381 ms 7394/31732 (23%)
[  3]  0.0-30.0 sec  34.3 MBytes  9.58 Mbits/sec   2.431 ms 8149/32583 (25%)
<...>                                                       ^^^^^^^^^^ ^^^^^

The thundering herd problem ([III] in the cover letter) surfaces, but
this is still an improvement.
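
For anyone who wants to recompute the drop rates from the raw iperf2
lines quoted above, a small Python sketch follows.  It assumes the
report line ends in a "lost/total (percent%)" field, as in these
outputs; the helper name and regular expression are my own and not part
of iperf2.

```python
import re

# Matches the trailing "lost/total (percent%)" field of an iperf2 UDP
# server report line, e.g. "89991/123091 (73%)".
LOSS_RE = re.compile(r"(\d+)\s*/\s*(\d+)\s+\(([\d.]+)%\)")


def drop_rate(line):
    """Return (lost, total, percent_lost) parsed from one report line."""
    match = LOSS_RE.search(line)
    if match is None:
        raise ValueError("no lost/total field found in: " + line)
    lost, total = int(match.group(1)), int(match.group(2))
    return lost, total, 100.0 * lost / total


if __name__ == "__main__":
    samples = [
        "[  3]  0.0-30.0 sec  46.4 MBytes  13.0 Mbits/sec   1.251 ms 89991/123091 (73%)",
        "[  3]  0.0-30.0 sec  46.3 MBytes  12.9 Mbits/sec   1.206 ms  807/33839 (2.4%)",
    ]
    for sample in samples:
        lost, total, pct = drop_rate(sample)
        print(f"lost {lost} of {total} datagrams: {pct:.1f}%")
```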

In conclusion, assuming we are going to use smaller sk_sndbuf or "UDP
Small Queues", I think it doesn't replace this RFC, and vice versa.

> I've been really impressed by the DROP_REASON work and had had no clue
> prior to seeing all that instrumentation, where else packets might be
> dropped in the kernel.
>
> I'd be interested to see what happens with sch_cake.

Sure, I will cover sch_cake in v2.

Thanks,
Peilin Ye
