[PATCH net-next V7 0/2] net/mlx5: Avoid payload in skb's linear part for better GRO-processing

From: Tariq Toukan <tariqt@nvidia.com>
Date: 2026-06-01 06:15:55
Also in: linux-rdma, lkml

Hi,

This is V7 of a series originally submitted by Christoph.

When LRO is enabled on mlx5 devices, mlx5e_skb_from_cqe_mpwrq_nonlinear()
copies part of the payload into the linear part of the skb.

This leads to suboptimal GRO processing and reduced throughput.

This series addresses the problem by using eth_get_headlen() to compute
the size of the protocol headers and copying only those bytes. This
results in a significant throughput improvement (detailed results are in
the relevant patch).
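
For readers unfamiliar with the idea, here is a rough userspace sketch of
"copy only the headers". It is not the driver code: sketch_headlen() and
sketch_pull_headers() are hypothetical names, and the parser handles only
untagged Ethernet + IPv4 + TCP, whereas the kernel's eth_get_headlen()
relies on the flow dissector and covers many more cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_HLEN 14

/* Hypothetical stand-in for eth_get_headlen(): returns the combined
 * length of the Ethernet, IPv4 and TCP headers, or the length of the
 * headers it managed to parse before giving up. */
static size_t sketch_headlen(const uint8_t *frame, size_t len)
{
	size_t off = SKETCH_ETH_HLEN;
	size_t ihl, doff;

	if (len < off)
		return len;
	/* EtherType 0x0800 is IPv4; otherwise stop after Ethernet. */
	if (frame[12] != 0x08 || frame[13] != 0x00)
		return off;
	if (len < off + 20)
		return off;
	/* IPv4 header length is the low nibble, in 32-bit words. */
	ihl = (size_t)(frame[off] & 0x0f) * 4;
	if (ihl < 20 || len < off + ihl)
		return off;
	/* IP protocol 6 is TCP; otherwise stop after the IP header. */
	if (frame[off + 9] != 6)
		return off + ihl;
	off += ihl;
	if (len < off + 20)
		return off;
	/* TCP data offset is the high nibble, in 32-bit words. */
	doff = (size_t)(frame[off + 12] >> 4) * 4;
	if (doff < 20 || len < off + doff)
		return off;
	return off + doff;
}

/* Copy only the protocol headers into the "linear" buffer and leave
 * the payload where it is. Returns the number of bytes copied. */
static size_t sketch_pull_headers(uint8_t *linear, size_t linear_cap,
				  const uint8_t *frame, size_t len)
{
	size_t hlen;

	assert(linear && frame);
	hlen = sketch_headlen(frame, len);
	if (hlen > linear_cap)
		hlen = linear_cap;
	memcpy(linear, frame, hlen);
	return hlen;
}
```

For a plain IPv4/TCP frame without options, this copies 54 bytes
(14 + 20 + 20) regardless of how much payload follows.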

Regards,
Tariq

---

V7:
- Drop cache aligned memcpy patch as it no longer shows benefits on
  further testing on other hosts.
- For XDP, pull at most ETH_HLEN bytes into linear part.
- Fix skb pull length calculation for XDP (Amery Hung).
- Switched from min_t() to min() to avoid skb->data_len 16-bit
  truncation (David Laight).
- Improved commit message for last patch to make it clear
  that the benchmark is not on native XDP (Sashiko).
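
To illustrate the min_t() item above, here is a small userspace sketch of
the truncation. SKETCH_MIN_T is a simplified stand-in for the kernel
macro, and the function names, parameter types and values are assumptions
made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's min_t(): both operands are cast
 * to `type` before comparing. Unlike the kernel macro, this version
 * evaluates its arguments more than once. */
#define SKETCH_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Casting a 32-bit data_len to 16 bits drops the upper bits. */
static unsigned int pull_len_min_t(uint16_t headlen, unsigned int data_len)
{
	return SKETCH_MIN_T(uint16_t, headlen, data_len);
}

/* Comparing at full width keeps data_len intact. */
static unsigned int pull_len_min(uint16_t headlen, unsigned int data_len)
{
	unsigned int h = headlen;

	return h < data_len ? h : data_len;
}

/* Usage: a data_len of 65546 wraps to 10 once cast to 16 bits, so the
 * min_t() variant picks 10 instead of the 54-byte header length. */
static void sketch_demo(void)
{
	assert(pull_len_min_t(54, 65546u) == 10);
	assert(pull_len_min(54, 65546u) == 54);
}
```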

V6:
https://lore.kernel.org/all/20260507095330.318892-1-tariqt@nvidia.com/

Christoph Paasch (2):
  net/mlx5e: DMA-sync earlier in mlx5e_skb_from_cqe_mpwrq_nonlinear
  net/mlx5e: Avoid copying payload to the skb's linear part

 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 33 ++++++++++++-------
 1 file changed, 22 insertions(+), 11 deletions(-)


base-commit: 8415598365503ced2e3d019491b0a2756c85c494
-- 
2.44.0