[PATCH net-next v4 0/3] net: devmem: allow rx-buf-size > PAGE_SIZE per binding
From: Bobby Eshleman <hidden>
Date: 2026-07-01 19:22:33
Also in:
dri-devel, linux-kselftest, linux-media, lkml
Every devmem dmabuf binding hands the page_pool PAGE_SIZE niovs today.
On NICs that consume one descriptor per netmem, this caps a single RX
descriptor at PAGE_SIZE and burns CPU on buffer churn.
In this series, we add a bind-time netlink attribute,
NETDEV_A_DMABUF_RX_BUF_SIZE, that lets userspace request a larger niov size
(power of two >= PAGE_SIZE). Drivers must opt in via
queue_mgmt_ops.QCFG_RX_PAGE_SIZE.
Selftests use udmabuf, but udmabuf sgtables were previously hardcoded to
PAGE_SIZE. This series modifies udmabuf to respect folio sizes in its exported
sgtable. The result is that when backing udmabuf with MFD_HUGETLB 2MB pages,
the sgtable is populated with 2MB entries, allowing devmem's gen_pool to carve
out large (eg. 64K) niovs.
Measurements
------------
Setup: kperf devmem RX/TX cuda, 4 flows, 64 MB messages, 60s, dctcp,
num-rx-queues=4, dmabuf-rx/tx-size-mb=2048, 10 runs per niov size,
mlx5.
niov RX dev Gbps RX flow avg Gbps app sys %
----- ---------------- ----------------- ----------------
4K 300.63 +/- 53.21 75.16 +/- 13.30 54.15 +/- 10.23
16K 321.35 +/- 28.20 80.34 +/- 7.05 41.05 +/- 8.87
32K 347.63 +/- 2.20 86.91 +/- 0.55 44.54 +/- 3.51
64K 332.11 +/- 14.26 83.03 +/- 3.56 35.47 +/- 3.11
RX app sys % drops ~19% from 4K to 64K.
kperf support (not yet merged):
https://github.com/facebookexperimental/kperf/commit/8837577f920876bce6986ec18869ac04439ebcd2
Signed-off-by: Bobby Eshleman <redacted>
---
Changes in v4:
- ncdevmem: fix the possible overflow in ncdevmem (Sashiko)
- drop the udmabuf patch because the fix is now already in net-next
- silenced two pylint complaints in devmem_lib.py
- Link to v3: https://lore.kernel.org/r/20260612-tcpdm-large-niovs-v3-0-a3b693e76fcb@meta.com (local)
Changes in v3:
- fix a bunch of non-reverse christmas tree declarations (Stan)
- remove extra uint32 cast for getpagesize() (Stan)
- remove overzealous strtoul checking (Stan)
- remove value checks that the kernel already performs on rx_buf_size
(Stan)
- Link to v2: https://lore.kernel.org/r/20260611-tcpdm-large-niovs-v2-0-ee2bf15e7523@meta.com (local)
Changes in v2:
- Use NL_SET_ERR_MSG_FMT for sg alignment failure details (Stan)
- Keep -E2BIG (not a direct ask, but seemed preferred, Stan)
- Update udmabuf commit message and comments explaining why
"one sg ent per folio" is useful (Christian)
- Set/restore nr_hugepages in py harness (Stan)
- Link to v1: https://lore.kernel.org/r/20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com (local)
---
Bobby Eshleman (3):
net: devmem: allow rx-buf-size > PAGE_SIZE per dmabuf binding
selftests/net: ncdevmem: add -b option to set rx-buf-size on bind
selftests/net: devmem.py: add check_rx_large_niov
Documentation/netlink/specs/netdev.yaml | 8 +++
include/uapi/linux/netdev.h | 1 +
net/core/devmem.c | 55 +++++++++++---------
net/core/devmem.h | 13 +++--
net/core/netdev-genl-gen.c | 5 +-
net/core/netdev-genl.c | 19 ++++++-
tools/include/uapi/linux/netdev.h | 1 +
tools/testing/selftests/drivers/net/hw/devmem.py | 12 ++++-
.../testing/selftests/drivers/net/hw/devmem_lib.py | 59 +++++++++++++++++++++-
tools/testing/selftests/drivers/net/hw/ncdevmem.c | 36 +++++++++++--
.../testing/selftests/drivers/net/hw/nk_devmem.py | 11 +++-
11 files changed, 180 insertions(+), 40 deletions(-)
---
base-commit: 805185b7c7a1069e407b6f7b3bc98e44d415f484
change-id: 20260602-tcpdm-large-niovs-56523a3a1077
Best regards,
--
Bobby Eshleman [off-list ref]