Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase


From: Lorenzo Bianconi <hidden>
Date: 2024-11-26 17:03:05
Also in: bpf

From: Daniel Xu <redacted>
Date: Mon, 25 Nov 2024 16:56:49 -0600

On Mon, Nov 25, 2024, at 9:12 AM, Alexander Lobakin wrote:

From: Daniel Xu <redacted>
Date: Fri, 22 Nov 2024 17:10:06 -0700

Hi Olek,

Here are the results.

On Wed, Nov 13, 2024 at 03:39:13PM GMT, Daniel Xu wrote:

On Tue, Nov 12, 2024, at 9:43 AM, Alexander Lobakin wrote:

[...]

Baseline (again)

         Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)           Throughput (Mbit/s)
Run 1    3169917       0.00007295       0.00007871       0.00009343       Run 1    21749.43
Run 2    3228290       0.00007103       0.00007679       0.00009215       Run 2    21897.17
Run 3    3226746       0.00007231       0.00007871       0.00009087       Run 3    21906.82
Run 4    3191258       0.00007231       0.00007743       0.00009087       Run 4    21155.15
Run 5    3235653       0.00007231       0.00007743       0.00008703       Run 5    21397.06
Average  3210372.8     0.000072182      0.000077814      0.00009087       Average  21621.126

cpumap v2 Olek

         Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)           Throughput (Mbit/s)
Run 1    3253651       0.00007167       0.00007807       0.00009343       Run 1    13497.57
Run 2    3221492       0.00007231       0.00007743       0.00009087       Run 2    12115.53
Run 3    3296453       0.00007039       0.00007807       0.00009087       Run 3    12323.38
Run 4    3254460       0.00007167       0.00007807       0.00009087       Run 4    12901.88
Run 5    3173327       0.00007295       0.00007871       0.00009215       Run 5    12593.22
Average  3239876.6     0.000071798      0.00007807       0.000091638      Average  12686.316
Delta    0.92%         -0.53%           0.33%            0.85%            Delta    -41.32%
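
As a quick sanity check on the Average and Delta figures for throughput, the arithmetic can be reproduced with a few lines of Python. This is only a sketch: the per-run values are copied from the two tables above, and the helper name pct_delta is made up for illustration. It assumes Delta means the relative change of the cpumap v2 average against the baseline average.

```python
from statistics import mean

# Throughput (Mbit/s) per run, copied from the two tables above.
baseline = [21749.43, 21897.17, 21906.82, 21155.15, 21397.06]
cpumap_v2 = [13497.57, 12115.53, 12323.38, 12901.88, 12593.22]


def pct_delta(new, old):
    """Relative change of mean(new) versus mean(old), in percent.

    Hypothetical helper, not part of any tool used in the thread.
    """
    old_avg = mean(old)
    return (mean(new) - old_avg) / old_avg * 100.0


if __name__ == "__main__":
    print(f"baseline avg:  {mean(baseline):.3f} Mbit/s")
    print(f"cpumap v2 avg: {mean(cpumap_v2):.3f} Mbit/s")
    print(f"delta:         {pct_delta(cpumap_v2, baseline):+.2f}%")
```

Under that assumption the script should agree with the Average and Delta rows in the throughput column.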


It's very interesting that we see -40% tput w/ the patches. I went back

Oh no, I messed up something =\

Could you please also test just patches 1-3 (up to
"bpf: cpumap: switch to GRO...") and 1-4 (up to "bpf: cpumap: reuse skb
array..."), rather than the whole series? It would be great to see whether
this implementation performs worse right from the start or whether I just
broke something later on.

Patches 1-3 reproduce the -40% tput numbers.

Ok, thanks! Seems like using the hybrid approach (GRO, but on top of
cpumap's kthreads instead of NAPI) really performs worse than switching
cpumap to NAPI.

With patches 1-4 the numbers get slightly worse (~1 Gbps lower), but it was noisy.

Interesting, I was sure patch 4 optimizes stuff... Maybe I'll give up on it.

tcp_rr results were unaffected.

@ Jakub,

Looks like I can't just use GRO without Lorenzo's conversion to NAPI, at
least for now =\ I took a look at the backlog NAPI and it could be used,
although we'd need a pointer in the backlog to the corresponding cpumap,
plus some synchronization point to make sure the backlog NAPI won't access
an already destroyed cpumap.

Maybe Lorenzo could take a look...

It seems to me the only difference would be that we would use the shared
backlog_napi kthreads instead of a dedicated kthread for each cpumap entry,
but we would still need the NAPI poll logic. I can look into it if you prefer
the shared kthread approach.
@Jakub: what do you think?

Regards,
Lorenzo

Thanks,
Olek
