RE: [PATCH v2] net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.
From: Haiyang Zhang <haiyangz@microsoft.com>
Date: 2025-07-23 19:55:52
Also in: bpf, linux-hyperv, linux-rdma, lkml
-----Original Message-----
From: Dipayaan Roy <redacted>
Sent: Wednesday, July 23, 2025 3:07 PM
To: horms@kernel.org; kuba@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang [off-list ref]; wei.liu@kernel.org; Dexuan Cui [off-list ref]; andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; Long Li [off-list ref]; Konstantin Taranov [off-list ref]; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; sdf@fomichev.me; lorenzo@kernel.org; michal.kubiak@intel.com; ernis@linux.microsoft.com; shradhagupta@linux.microsoft.com; Shiraz Saleem [off-list ref]; rosenp@gmail.com; netdev@vger.kernel.org; linux-hyperv@vger.kernel.org; linux-rdma@vger.kernel.org; bpf@vger.kernel.org; linux-kernel@vger.kernel.org; ssengar@linux.microsoft.com; Dipayaan Roy [off-list ref]
Subject: [PATCH v2] net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.

This patch enhances RX buffer handling in the mana driver by allocating
pages from a page pool and slicing them into MTU-sized fragments, rather
than dedicating a full page per packet. This approach is especially
beneficial on systems with large page sizes like 64KB.

Key improvements:
- Proper integration of page pool for RX buffer allocations.
- MTU-sized buffer slicing to improve memory utilization.
- Reduced overall per-RX-queue memory footprint.
- Automatic fallback to full-page buffers when:
  * Jumbo frames are enabled (MTU > PAGE_SIZE / 2).
  * The XDP path is active, to avoid complexities with fragment reuse.
- Removal of redundant pre-allocated RX buffers used in scenarios like
  MTU changes, ensuring consistency in RX buffer allocation.

Testing on VMs with 64KB pages shows around 200% throughput improvement.
Memory efficiency is significantly improved due to reduced wastage in
page allocations.

Example: We are now able to fit 35 RX buffers in a single 64KB page for
an MTU of 1500, instead of 1 RX buffer per page previously.
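
For illustration only, the 35-buffers-per-page figure and the fallback rules above can be reproduced with a small userspace sketch. This is not code from the patch. The names `SKETCH_RXBUF_PAD`, `SKETCH_FRAG_ALIGN`, `align_up` and `rx_frags_per_page` are hypothetical, and the two constants are assumptions: 334 bytes of per-buffer overhead (roughly an aligned `struct skb_shared_info` plus an Ethernet header on a 64-bit build) and 64-byte fragment alignment. The fallback condition is the one stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: per-buffer overhead in bytes (not taken from the patch). */
#define SKETCH_RXBUF_PAD  334u
/* ASSUMPTION: each fragment is rounded up to this alignment. */
#define SKETCH_FRAG_ALIGN 64u

/* Round v up to the next multiple of a (a must be non-zero). */
uint32_t align_up(uint32_t v, uint32_t a)
{
	assert(a != 0);
	return (v + a - 1) / a * a;
}

/*
 * Hypothetical helper: how many RX buffers share one page.
 * Falls back to one buffer per page when XDP is active or when
 * MTU > page_size / 2, as described in the commit message.
 */
uint32_t rx_frags_per_page(uint32_t mtu, uint32_t page_size, bool xdp_active)
{
	uint32_t buf_size;

	if (xdp_active || mtu > page_size / 2)
		return 1;

	/* Payload plus assumed overhead, rounded up to the alignment. */
	buf_size = align_up(mtu + SKETCH_RXBUF_PAD, SKETCH_FRAG_ALIGN);
	return page_size / buf_size;
}
```

Under these assumptions, `rx_frags_per_page(1500, 65536, false)` rounds each buffer up to 1856 bytes and yields 35 buffers per 64KB page, while enabling XDP or using MTU 9000 on a 4KB page yields 1. The real driver may compute sizes differently.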
Tested:
- iperf3, iperf2, and nttcp benchmarks.
- Jumbo frames with MTU 9000.
- Native XDP programs (XDP_PASS, XDP_DROP, XDP_TX, XDP_REDIRECT) for
  testing the XDP path in driver.
- Page leak detection (kmemleak).
- Driver load/unload, reboot, and stress scenarios.

Signed-off-by: Dipayaan Roy <redacted>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>