Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase

[RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-09-16
[RFC/RFT v2 1/3] net: Add napi_init_for_gro routine · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-09-16
[RFC/RFT v2 2/3] net: add napi_threaded_poll to netdevice.h · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-09-16
[RFC/RFT v2 3/3] bpf: cpumap: Add gro support · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-09-16
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-09-16
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-10-08
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-10-09
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-10-09
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-10-09
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-10-09
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-10-22
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-11-12
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-11-13
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-11-23
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Jesper Dangaard Brouer <hawk@kernel.org> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <hidden> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-11-25
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-11-26
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <hidden> · 2024-11-26
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Jesper Dangaard Brouer <hawk@kernel.org> · 2024-11-26
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-11-28
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Lorenzo Bianconi <lorenzo@kernel.org> · 2024-11-28
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-11-28
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Jakub Kicinski <kuba@kernel.org> · 2024-12-02
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-12-03
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Jakub Kicinski <kuba@kernel.org> · 2024-12-04
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-12-04
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-12-04
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-12-05
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-12-05
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-12-06
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Alexander Lobakin <aleksander.lobakin@intel.com> · 2024-12-06
Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase · Daniel Xu <hidden> · 2024-12-06

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: 2024-10-22 15:52:14
Also in: bpf

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Wed, 9 Oct 2024 14:50:42 +0200

From: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Wed, 9 Oct 2024 14:47:58 +0200

From: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Wed, 9 Oct 2024 12:46:00 +0200

Hi Lorenzo,

On Mon, Sep 16, 2024 at 12:13:42PM GMT, Lorenzo Bianconi wrote:

Add GRO support to cpumap codebase moving the cpu_map_entry kthread to a
NAPI-kthread pinned on the selected cpu.

Changes in rfc v2:
- get rid of dummy netdev dependency

Lorenzo Bianconi (3):
  net: Add napi_init_for_gro routine
  net: add napi_threaded_poll to netdevice.h
  bpf: cpumap: Add gro support

 include/linux/netdevice.h |   3 +
 kernel/bpf/cpumap.c       | 123 ++++++++++++++++----------------------
 net/core/dev.c            |  27 ++++++---
 3 files changed, 73 insertions(+), 80 deletions(-)

-- 
2.46.0

Sorry about the long delay - finally caught up on everything after
conferences.

I re-ran my synthetic tests (including baseline). v2 is somehow showing
2x bigger gains than v1 (~30% vs ~14%) for tcp_stream. Again, the only
variable I changed is kernel version - steering prog is active for both.


Baseline (again)

./tcp_rr -c -H $TASK_IP -p 50,90,99 -T4 -F8 -l30

         Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)
Run 1    2560252       0.00009087       0.00010495       0.00011647
Run 2    2665517       0.00008575       0.00010239       0.00013311
Run 3    2755939       0.00008191       0.00010367       0.00012287
Run 4    2595680       0.00008575       0.00011263       0.00012671
Run 5    2841865       0.00007999       0.00009471       0.00012799
Average  2683850.6     0.000084854      0.00010367       0.00012543

./tcp_stream -c -H $TASK_IP -T8 -F16 -l30

         Throughput (Mbit/s)
Run 1    15479.31
Run 2    15162.48
Run 3    14709.04
Run 4    15373.06
Run 5    15234.91
Average  15191.76

cpumap NAPI patches v2

tcp_rr (same command as baseline)

         Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)
Run 1    2577838       0.00008575       0.00012031       0.00013695
Run 2    2729237       0.00007551       0.00013311       0.00017663
Run 3    2689442       0.00008319       0.00010495       0.00013311
Run 4    2862366       0.00008127       0.00009471       0.00010623
Run 5    2700538       0.00008319       0.00010367       0.00012799
Average  2711884.2     0.000081782      0.00011135       0.000136182
Delta    1.04%         -3.62%           7.41%            8.57%

tcp_stream (same command as baseline)

         Throughput (Mbit/s)
Run 1    19914.56
Run 2    20140.92
Run 3    19887.48
Run 4    19374.49
Run 5    19784.49
Average  19820.388
Delta    30.47%

Thanks,
Daniel
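
For readers who want to check the Delta row, here is a minimal sketch. It assumes Delta is the relative change of the patched average over the baseline average, which the thread does not state explicitly. The function and variable names are illustrative and are not taken from the thread; the per-run numbers are copied from the tcp_stream and tcp_rr results above.

```python
from statistics import mean

# Per-run tcp_stream throughput in Mbit/s, copied from the results above.
baseline_mbit = [15479.31, 15162.48, 14709.04, 15373.06, 15234.91]
patched_mbit = [19914.56, 20140.92, 19887.48, 19374.49, 19784.49]

# Per-run tcp_rr transaction counts, copied from the results above.
baseline_txn = [2560252, 2665517, 2755939, 2595680, 2841865]
patched_txn = [2577838, 2729237, 2689442, 2862366, 2700538]


def pct_delta(base, new):
    """Relative change of mean(new) versus mean(base), in percent.

    Assumed definition of the 'Delta' row; not confirmed by the thread.
    """
    base_avg = mean(base)
    return (mean(new) - base_avg) / base_avg * 100.0


throughput_delta = pct_delta(baseline_mbit, patched_mbit)
transactions_delta = pct_delta(baseline_txn, patched_txn)

print(f"tcp_stream throughput delta: {throughput_delta:.2f}%")
print(f"tcp_rr transactions delta:   {transactions_delta:.2f}%")
```

Under that assumption the computed values round to the 30.47% and 1.04% reported in the Delta row.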

Hi Daniel,

cool, thx for testing it.

@Olek: how do we want to proceed on it? Are you still working on it or do you want me
to send a regular patch for it?

Hi,

I had a small vacation, sorry. I'm starting to work on it again today.

ack, no worries. Are you going to rebase the other patches on top of it
or are you going to try a different approach?

I'll try the approach without NAPI as Kuba asks and let Daniel test it,
then we'll see.

For now, I have the same results without NAPI as with your series, so
I'll push it soon and let Daniel test.

(I simply decoupled GRO and NAPI and used the former in cpumap, but the
 kthread logic didn't change)

BTW I'm curious how he got this boost on v2, from what I see you didn't
change the implementation that much?

Thanks,
Olek
