Thread: 9 messages, 6 authors, 2025-07-31

RE: [PATCH v2] net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: 2025-07-23 19:55:52
Also in: bpf, linux-hyperv, linux-rdma, lkml

-----Original Message-----
From: Dipayaan Roy <redacted>
Sent: Wednesday, July 23, 2025 3:07 PM
To: horms@kernel.org; kuba@kernel.org; KY Srinivasan <kys@microsoft.com>;
Haiyang Zhang [off-list ref]; wei.liu@kernel.org; Dexuan Cui
[off-list ref]; andrew+netdev@lunn.ch; davem@davemloft.net;
edumazet@google.com; pabeni@redhat.com; Long Li [off-list ref];
Konstantin Taranov [off-list ref]; ast@kernel.org;
daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com;
sdf@fomichev.me; lorenzo@kernel.org; michal.kubiak@intel.com;
ernis@linux.microsoft.com; shradhagupta@linux.microsoft.com; Shiraz Saleem
[off-list ref]; rosenp@gmail.com; netdev@vger.kernel.org;
linux-hyperv@vger.kernel.org; linux-rdma@vger.kernel.org;
bpf@vger.kernel.org; linux-kernel@vger.kernel.org;
ssengar@linux.microsoft.com; Dipayaan Roy [off-list ref]
Subject: [PATCH v2] net: mana: Use page pool fragments for RX buffers
instead of full pages to improve memory efficiency.

This patch enhances RX buffer handling in the mana driver by allocating
pages from a page pool and slicing them into MTU-sized fragments, rather
than dedicating a full page per packet. This approach is especially
beneficial on systems with large page sizes like 64KB.

Key improvements:

- Proper integration of page pool for RX buffer allocations.
- MTU-sized buffer slicing to improve memory utilization.
- Reduced overall per-RX-queue memory footprint.
- Automatic fallback to full-page buffers when:
   * Jumbo frames are enabled (MTU > PAGE_SIZE / 2).
   * The XDP path is active, to avoid complexities with fragment reuse.
- Removal of redundant pre-allocated RX buffers used in scenarios like MTU
  changes, ensuring consistency in RX buffer allocation.
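
The fallback rule in the list above can be sketched as a small predicate.
This is an illustration only, not the code from the patch: the function
name rx_use_page_frags and its parameters are hypothetical, and the real
driver may also account for headroom and padding when comparing against
half a page.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the driver's actual function).
 * Returns true when RX buffers can be carved out of a shared page as
 * fragments, false when the queue should fall back to one full page
 * per buffer.
 */
bool rx_use_page_frags(size_t mtu, size_t page_size, bool xdp_active)
{
	/* XDP path active: use full pages to avoid fragment-reuse issues. */
	if (xdp_active)
		return false;

	/* Jumbo frames: a buffer larger than half a page cannot be shared. */
	if (mtu > page_size / 2)
		return false;

	return true;
}
```

Under this rule, MTU 1500 on a 64KB page uses fragments, while MTU 9000 on
a 4KB page, or any MTU with XDP attached, uses full pages.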

Testing on VMs with 64KB pages shows a throughput improvement of around
200%. Memory efficiency also improves significantly because far less of
each page is wasted. For example, with an MTU of 1500, 35 RX buffers now
fit in a single 64KB page, where previously each page held one RX buffer.
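
One way to arrive at the figure of 35 buffers per page is sketched below.
The per-buffer overhead (334 bytes, standing in for a 14-byte Ethernet
header plus roughly 320 bytes of skb_shared_info) and the 64-byte
alignment are assumptions chosen for illustration; the actual values
depend on the kernel configuration and architecture. The helper names are
hypothetical as well.

```c
#include <stddef.h>

/* Round len up to a multiple of align (align must be a power of two). */
size_t align_up(size_t len, size_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/*
 * Number of equally sized RX buffers that fit in one page.
 * overhead and align are assumed inputs, not values taken from the patch.
 */
size_t rx_bufs_per_page(size_t page_size, size_t mtu,
			size_t overhead, size_t align)
{
	size_t buf_size = align_up(mtu + overhead, align);

	return page_size / buf_size;
}
```

With these assumed inputs, rx_bufs_per_page(65536, 1500, 334, 64) rounds
each buffer up to 1856 bytes and yields 35 buffers per 64KB page.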

Tested:

- iperf3, iperf2, and nttcp benchmarks.
- Jumbo frames with MTU 9000.
- Native XDP programs (XDP_PASS, XDP_DROP, XDP_TX, XDP_REDIRECT) to
  exercise the XDP path in the driver.
- Page leak detection (kmemleak).
- Driver load/unload, reboot, and stress scenarios.

Signed-off-by: Dipayaan Roy <redacted>

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
