Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content

[PATCH v2 0/6] Add FDMA support on ocelot switch driver · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
[PATCH v2 1/6] net: ocelot: add support to get port mac from device-tree · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 1/6] net: ocelot: add support to get port mac from device-tree · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 1/6] net: ocelot: add support to get port mac from device-tree · Julian Wiedmann <hidden> · 2021-11-15
Re: [PATCH v2 1/6] net: ocelot: add support to get port mac from device-tree · Clément Léger <clement.leger@bootlin.com> · 2021-11-15
[PATCH v2 2/6] dt-bindings: net: convert mscc,vsc7514-switch bindings to yaml · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 2/6] dt-bindings: net: convert mscc,vsc7514-switch bindings to yaml · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 2/6] dt-bindings: net: convert mscc,vsc7514-switch bindings to yaml · Clément Léger <clement.leger@bootlin.com> · 2021-11-08
Re: [PATCH v2 2/6] dt-bindings: net: convert mscc,vsc7514-switch bindings to yaml · Rob Herring <robh@kernel.org> · 2021-11-12
[PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Jakub Kicinski <kuba@kernel.org> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-15
Re: [PATCH v2 3/6] net: ocelot: pre-compute injection frame header content · Clément Léger <clement.leger@bootlin.com> · 2021-11-15
[PATCH v2 4/6] net: ocelot: add support for ndo_change_mtu · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 4/6] net: ocelot: add support for ndo_change_mtu · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 4/6] net: ocelot: add support for ndo_change_mtu · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
[PATCH v2 6/6] net: ocelot: add jumbo frame support for FDMA · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 6/6] net: ocelot: add jumbo frame support for FDMA · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 6/6] net: ocelot: add jumbo frame support for FDMA · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
[PATCH v2 5/6] net: ocelot: add FDMA support · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 5/6] net: ocelot: add FDMA support · Denis Kirjanov <hidden> · 2021-11-03
Re: [PATCH v2 5/6] net: ocelot: add FDMA support · Vladimir Oltean <vladimir.oltean@nxp.com> · 2021-11-03
Re: [PATCH v2 5/6] net: ocelot: add FDMA support · Clément Léger <clement.leger@bootlin.com> · 2021-11-03
Re: [PATCH v2 0/6] Add FDMA support on ocelot switch driver · Denis Kirjanov <hidden> · 2021-11-03

From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: 2021-11-15 10:52:03
Also in: lkml, netdev

On Mon, Nov 15, 2021 at 11:13:44AM +0100, Clément Léger wrote:

On Wed, 3 Nov 2021 14:53:51 +0100,
Clément Léger [off-list ref] wrote:

quoted

On Wed, 3 Nov 2021 12:38:12 +0000,
Vladimir Oltean [off-list ref] wrote:

quoted

On Wed, Nov 03, 2021 at 10:19:40AM +0100, Clément Léger wrote:

quoted

IFH preparation can take quite some time on slow processors (up to
5% in an iperf3 test, for instance). To reduce the cost of this
preparation, pre-compute the IFH, since most of the parameters are
fixed per port. Only rew_op and the VLAN tag are set when sending,
and only if they are non-zero. This removes the calls to packing()
entirely in basic usage. At the same time, export this function so
that it can be used by FDMA.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
---

Honestly, this feels a bit cheap/gimmicky, and not really the
fundamental thing to address. In my testing of a similar idea (see
commits 67c2404922c2 ("net: dsa: felix: create a template for the DSA
tags on xmit") and then 7c4bb540e917 ("net: dsa: tag_ocelot: create
separate tagger for Seville")), the net difference is not that stark,
considering that now you need to access one more memory region which
you did not need before, do a memcpy, and then patch the IFH anyway
for the non-constant stuff.

The memcpy is negligible and the patching happens only in a few
cases (at least compared with the packing() function call). The VSC7514
CPU is really slow, which leads to between 2.5% and 5% of the time being
spent in packing() when using iperf3, depending on the use case
(according to ftrace).
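
To make the trade-off under discussion concrete, here is a minimal sketch in C of the template-then-patch idea: build the constant part of the header once per port, then memcpy it per frame and overwrite only the fields that vary. Everything in it is an assumption for illustration. The 16-byte header size, the byte offsets, and the names (port_ifh_template, ifh_template_init, ifh_build) are hypothetical and do not reflect the real IFH bit layout or the code in the patch.

```c
#include <stdint.h>
#include <string.h>

#define IFH_LEN 16 /* assumed header size for this sketch */

/* Hypothetical byte offsets, NOT the real hardware bit layout. */
enum {
	IFH_OFF_PORT   = 0,
	IFH_OFF_REW_OP = 4,
	IFH_OFF_VLAN   = 8,
};

/* Hypothetical per-port state: header bytes that never change for a port. */
struct port_ifh_template {
	uint8_t bytes[IFH_LEN];
};

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Store a 16-bit value in big-endian byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

/* Slow path, run once per port: fill in the fields that are fixed. */
void ifh_template_init(struct port_ifh_template *t, uint8_t port)
{
	memset(t->bytes, 0, IFH_LEN);
	t->bytes[IFH_OFF_PORT] = port;
}

/*
 * Fast path, run per frame: copy the template, then patch only the
 * non-constant fields, and only when they are non-zero.
 */
void ifh_build(const struct port_ifh_template *t, uint32_t rew_op,
	       uint16_t vlan_tag, uint8_t out[IFH_LEN])
{
	memcpy(out, t->bytes, IFH_LEN);
	if (rew_op)
		put_be32(out + IFH_OFF_REW_OP, rew_op);
	if (vlan_tag)
		put_be16(out + IFH_OFF_VLAN, vlan_tag);
}
```

In this sketch the per-frame cost is one 16-byte copy plus at most two small stores, at the price of reading the per-port template from memory, which is the extra memory access raised in the review above.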

quoted

Certainly, for the calls to ocelot_port_inject_frame() from DSA, I
would prefer not having this pre-computed IFH.

Could you provide some before/after performance numbers and perf
counters?

I will make another round of measurements to confirm my previous numbers
and check the impact on the injection rate on ocelot.

I checked my bandwidth numbers again (obtained with iperf3) with and
without the pre-computed header:

Test on standard packets with UDP (iperf3 -t 100 -l 1460 -u -b 0 -c *)
- With pre-computed header:    UDP TX: 33 Mbit/s
- Without pre-computed header: UDP TX: 31 Mbit/s
-> 6.5% improvement

Test on small packets with UDP (iperf3 -t 100 -l 700 -u -b 0 -c *)
- With pre-computed header:    UDP TX: 15.8 Mbit/s
- Without pre-computed header: UDP TX: 16.4 Mbit/s
-> 4.3% improvement

The improvement might not be huge, but it is not negligible either.
Please tell me whether you want me to drop it, based on those numbers.

Is this with manual injection or with FDMA? Do you have before/after
numbers with FDMA as well? At 31 vs 33 Mbps, this isn't going to compete
for any races anyway :)
