[PATCH net-next v3 0/4] Add support to do threaded napi busy poll
From: Samiullah Khawaja <hidden>
Date: 2025-02-05 00:10:54
Extend the existing threaded NAPI poll support to do continuous busy
polling.

This is used to poll a NAPI continuously and fetch descriptors from the
backing RX/TX queues for low-latency applications. Threaded busy poll
can be enabled through netlink, so it can be turned on for a set of
dedicated NAPIs used by low-latency applications. This makes NAPI busy
poll available to any userspace application, independent of the
userspace API used for packet and event processing (epoll, io_uring,
raw socket APIs). Once it is enabled, the user can fetch the PID of the
kthread doing the NAPI polling and set its affinity, priority and
scheduler according to the low-latency requirements.

Currently threaded NAPI can only be enabled at device level using
sysfs. Add support to enable/disable threaded mode for an individual
NAPI. This is done through the netlink interface: extend the `napi-set`
op in the netlink spec to allow setting the `threaded` attribute of a
NAPI.

Extend the threaded attribute in the napi struct with an option to
enable continuous busy polling. Extend the netlink and sysfs interfaces
to allow enabling/disabling threaded busy polling at device or
individual NAPI level.

We use this for our AF_XDP based hard low-latency use case with the
onload stack (https://github.com/Xilinx-CNS/onload), which runs in
userspace. Our use case is fixed-frequency RPC-style traffic with a
fixed request/response size. We simulated this using neper by starting
the next transaction only when the previous one has completed. The
experiment results are listed below.

Setup:

- Running on Google C3 VMs with the idpf driver with the following
  configurations.
- IRQ affinity and coalescing are common to both experiments.
- There is only 1 RX/TX queue configured.
- First experiment enables busy poll using sysctl for both epoll and
  socket APIs.
- Second experiment enables NAPI threaded busy poll for the full device
  using sysctl.

Non-threaded NAPI busy poll enabled using sysctl:
echo 400 | sudo tee /proc/sys/net/core/busy_poll
echo 400 | sudo tee /proc/sys/net/core/busy_read
echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 15000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
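As an illustration of the tuning step described earlier (fetching the
PID of the NAPI kthread and setting its affinity and scheduler), here
is a minimal Python sketch. It is not part of this series. The helper
name `pin_thread` and the CPU and priority values are hypothetical; the
PID is assumed to have been obtained separately, for example from what
the kernel reports for the NAPI over netlink.

```python
import os


def pin_thread(pid, cpus, fifo_priority=None):
    """Pin thread `pid` to the CPU set `cpus`.

    If `fifo_priority` is given, also switch the thread to SCHED_FIFO
    with that priority (this normally requires root / CAP_SYS_NICE).
    Returns the affinity mask the kernel reports afterwards.
    """
    os.sched_setaffinity(pid, cpus)
    if fifo_priority is not None:
        param = os.sched_param(fifo_priority)
        os.sched_setscheduler(pid, os.SCHED_FIFO, param)
    return os.sched_getaffinity(pid)


# Hypothetical usage, with napi_pid being the PID of the NAPI kthread:
#   pin_thread(napi_pid, {2}, fifo_priority=50)
```

The same effect can be had with taskset and chrt; the sketch only shows
that no special interface is needed once the kthread PID is known.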
Results using the following command:
sudo EF_NO_FAIL=0 EF_POLL_USEC=100000 taskset -c 3-10 onload -v \
--profile=latency ./neper/tcp_rr -Q 200 -R 400 -T 1 -F 50 \
-p 50,90,99,999 -H <IP> -l 10
...
...
num_transactions=2835
latency_min=0.000018976
latency_max=0.049642100
latency_mean=0.003243618
latency_stddev=0.010636847
latency_p50=0.000025270
latency_p90=0.005406710
latency_p99=0.049807350
latency_p99.9=0.049807350
Results with NAPI threaded busy poll using the following command:
sudo EF_NO_FAIL=0 EF_POLL_USEC=100000 taskset -c 3-10 onload -v \
--profile=latency ./neper/tcp_rr -Q 200 -R 400 -T 1 -F 50 \
-p 50,90,99,999 -H <IP> -l 10
...
...
num_transactions=460163
latency_min=0.000015707
latency_max=0.200182942
latency_mean=0.000019453
latency_stddev=0.000720727
latency_p50=0.000016950
latency_p90=0.000017270
latency_p99=0.000018710
latency_p99.9=0.000020150
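To put the two result sets side by side, the following Python sketch
parses the neper key=value output shown above and computes ratios
between the runs. The numbers are copied from the two outputs; the
function names `parse_stats` and `speedups` are made up for this
illustration.

```python
NON_THREADED = """
num_transactions=2835
latency_p50=0.000025270
latency_p90=0.005406710
latency_p99=0.049807350
latency_p99.9=0.049807350
"""

THREADED = """
num_transactions=460163
latency_p50=0.000016950
latency_p90=0.000017270
latency_p99=0.000018710
latency_p99.9=0.000020150
"""


def parse_stats(text):
    """Turn 'key=value' lines into a dict of floats, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        try:
            stats[key] = float(value)
        except ValueError:
            continue
    return stats


def speedups(base, new):
    """Ratios > 1 mean the `new` run is better than the `base` run."""
    result = {"num_transactions":
              new["num_transactions"] / base["num_transactions"]}
    for key in base:
        if key.startswith("latency_"):
            # Lower latency is better, so divide baseline by new.
            result[key] = base[key] / new[key]
    return result


for name, ratio in speedups(parse_stats(NON_THREADED),
                            parse_stats(THREADED)).items():
    print(f"{name}: {ratio:.1f}x")
```

The ratios are only as meaningful as the two runs are comparable; they
come from a single 10 second run each.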
Here, with NAPI threaded busy poll running on a separate core, we are
able to poll the NAPI consistently and keep latency to an absolute
minimum. We are also able to do this without any major changes to the
onload stack and threading model.

v3:
- Fixed calls to dev_set_threaded in drivers

v2:
- Add documentation in napi.rst.
- Provide experiment data and usecase details.
- Update busy_poller selftest to include napi threaded poll testcase.
- Define threaded mode enum in netlink interface.
- Included NAPI threaded state in napi config to save/restore.

Samiullah Khawaja (4):
  Add support to set napi threaded for individual napi
  net: Create separate gro_flush helper function
  Extend napi threaded polling to allow kthread based busy polling
  selftests: Add napi threaded busy poll test in `busy_poller`

 Documentation/ABI/testing/sysfs-class-net     |   3 +-
 Documentation/netlink/specs/netdev.yaml       |  14 ++
 Documentation/networking/napi.rst             |  80 ++++++++++-
 .../net/ethernet/atheros/atl1c/atl1c_main.c   |   2 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.c     |   2 +-
 drivers/net/ethernet/renesas/ravb_main.c      |   2 +-
 drivers/net/wireless/ath/ath10k/snoc.c        |   2 +-
 include/linux/netdevice.h                     |  24 +++-
 include/uapi/linux/netdev.h                   |   7 +
 net/core/dev.c                                | 127 ++++++++++++++----
 net/core/net-sysfs.c                          |   2 +-
 net/core/netdev-genl-gen.c                    |   5 +-
 net/core/netdev-genl.c                        |   9 ++
 tools/include/uapi/linux/netdev.h             |   7 +
 tools/testing/selftests/net/busy_poll_test.sh |  25 +++-
 tools/testing/selftests/net/busy_poller.c     |  14 +-
 16 files changed, 285 insertions(+), 40 deletions(-)

-- 
2.48.1.362.g079036d154-goog