Re: [RFC PATCH 00/14] Introducing AF_PACKET V4 support (AF_XDP or AF_CHANNEL?)
From: Björn Töpel <hidden>
Date: 2017-11-14 19:01:02
2017-11-14 18:19 GMT+01:00 Jesper Dangaard Brouer [off-list ref]:
> On Mon, 13 Nov 2017 22:07:47 +0900 Björn Töpel [off-list ref] wrote:
>
> > I'll summarize the major points that we'll address in the next RFC
> > below.
> >
> > * Instead of extending AF_PACKET with yet another version, introduce
> >   a new address/packet family. As for naming, we had some
> >   suggestions: AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll
> >   go for AF_ZEROCOPY, unless there are strong opinions against it.
>
> I mostly like AF_CHANNEL and AF_XDP.
>
> I do know that XDP is (or has evolved into) a kernel-side facility
> that moves XDP frames/packets _inside_ the kernel.
>
> *BUT* I've always imagined that we would create a "channel" to
> userspace, using XDP_REDIRECT to choose which frames get redirected
> into which userspace "channel" (a new channel-map type).
>
> Userspace pre-allocates and registers memory/pages exactly as in this
> patchset.
>
> [Step 1]: (non-ZC) XDP_REDIRECT needs to copy the frame data into the
> userspace memory pages and update your packet_array etc. (Use
> map-flush to get RX bulking.)
>
> [Step 2]: (ZC) Userspace calls a driver NDO to register pages. The
> XDP_REDIRECT action happens in the driver and can have knowledge of
> the RX ring. It can know whether this RX ring is zero-copy enabled
> and, if so, skip the copy step.

Jesper, I *really* like this approach -- especially the fact that the existing XDP path in the drivers can be reused. I'll spend some time dissecting the details of your suggestion.
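
To make the two steps concrete, here is a minimal userspace model in C of the copy versus zero-copy decision. It is only a sketch: `struct channel`, `struct rx_ring` and `channel_redirect()` are hypothetical names invented for illustration, not kernel APIs, and the model assumes that in the ZC case the device has already written the frame into registered userspace memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 2048
#define NUM_FRAMES 4

/* Hypothetical model of a userspace "channel": memory that userspace
 * preallocated and registered, plus bookkeeping for filled frames. */
struct channel {
	unsigned char frames[NUM_FRAMES][FRAME_SIZE]; /* registered memory */
	size_t len[NUM_FRAMES];   /* length of each filled frame */
	unsigned int filled;      /* frames handed over to userspace */
	unsigned int copies;      /* number of copy operations performed */
};

/* Hypothetical model of an RX ring. With zc_enabled set, the device is
 * assumed to DMA straight into the channel's registered frames. */
struct rx_ring {
	bool zc_enabled;
};

/* Redirect one received frame into the channel.
 * Returns 0 on success, -1 if the channel is full or the frame is too big. */
int channel_redirect(struct channel *ch, const struct rx_ring *ring,
		     const unsigned char *data, size_t len)
{
	unsigned int slot;

	if (ch->filled >= NUM_FRAMES || len > FRAME_SIZE)
		return -1;

	slot = ch->filled;
	if (!ring->zc_enabled) {
		/* Step 1 (non-ZC): copy the frame data into registered memory. */
		memcpy(ch->frames[slot], data, len);
		ch->copies++;
	}
	/* Step 2 (ZC): the data is assumed to live in ch->frames[slot]
	 * already, so the copy is skipped. */
	ch->len[slot] = len;
	ch->filled++;
	return 0;
}
```

Note that neither branch allocates memory; the only difference between the two steps in this model is the `memcpy()`.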

> > * No explicit zerocopy enablement. Use the zerocopy path if
> >   supported; if not, fall back to the skb path for netdevs that
> >   don't support the required ndos.
>
> When the driver does not support the NDO in the above model, I think
> there will still be a significant performance boost for the non-ZC
> variant, even though we need a copy operation, because there are no
> memory allocations: userspace has preallocated and registered pages
> with the kernel (and memory limits are implicit via the memory size
> registered by userspace).

Yup, and we're not paying for the whole skb creation, given that we execute from XDP_DRV and not XDP_SKB.
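
The "no explicit enablement" point could be sketched roughly as follows: try the driver's zero-copy hook if it exists, otherwise fall back to the copy path. All names here (`netdev_ops_model`, `zc_register_pages`, `select_rx_path()`, `demo_register_pages()`) are hypothetical and only model the idea; they do not correspond to real net_device_ops members.

```c
#include <stddef.h>

enum rx_path { RX_PATH_ZEROCOPY, RX_PATH_COPY };

/* Hypothetical subset of driver operations; the hook name is made up. */
struct netdev_ops_model {
	int (*zc_register_pages)(void *pages, size_t npages);
};

struct netdev_model {
	const struct netdev_ops_model *ops;
};

/* Stand-in driver hook for illustration: accepts any non-empty page set. */
int demo_register_pages(void *pages, size_t npages)
{
	(void)pages;
	return npages > 0 ? 0 : -1;
}

/* Pick the RX path without an explicit userspace switch: use zero-copy
 * when the driver provides the hook and accepts the pages, otherwise
 * fall back to copying into the preregistered memory. */
enum rx_path select_rx_path(const struct netdev_model *dev,
			    void *pages, size_t npages)
{
	if (dev->ops && dev->ops->zc_register_pages &&
	    dev->ops->zc_register_pages(pages, npages) == 0)
		return RX_PATH_ZEROCOPY;

	return RX_PATH_COPY;
}
```

In this model, a device whose ops leave `zc_register_pages` as NULL simply ends up on `RX_PATH_COPY`, which still avoids per-packet allocations because the memory was registered up front.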

> > * Do not introduce a new XDP action XDP_PASS_TO_KERNEL; instead, use
> >   the XDP redirect map call with an ingress flag.
>
> In the above model, XDP_REDIRECT is used for filtering into a userspace
> "channel". If ZC gets enabled on an RX-ring queue, then XDP_PASS has to
> do a copy (RX-ring knowledge is available), as you describe with
> XDP_PASS_TO_KERNEL.

Again, this fits nicely in.

> > * Extend the XDP redirect to support explicit allocator/destructor
> >   functions. Right now, XDP redirect assumes that the page allocator
> >   was used, and the XDP redirect cleanup path decrements the page
> >   count of the XDP buffer. This assumption breaks for the zerocopy
> >   case.
>
> Yes, please. If XDP_REDIRECT gets to call a destructor callback, then
> we can allow XDP_REDIRECT out through another net_device, even when ZC
> is enabled on an RX-ring queue.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
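
The destructor idea in the last point can be illustrated with a small C model in which each buffer carries its own release callback, so the cleanup path no longer assumes a page refcount. This is a sketch under assumptions: `xdp_buf_model`, `release_page_backed()`, `release_zc_frame()` and `redirect_cleanup()` are hypothetical names, not the kernel's actual structures or functions.

```c
#include <stddef.h>

struct xdp_buf_model;
typedef void (*buf_release_fn)(struct xdp_buf_model *buf);

/* Hypothetical buffer model carrying an explicit destructor. */
struct xdp_buf_model {
	void *data;
	int page_refcount;      /* meaningful for page-allocator buffers */
	int *zc_returned;       /* ZC: count of frames given back to userspace */
	buf_release_fn release; /* explicit destructor chosen by the allocator */
};

/* Page-allocator-backed buffer: cleanup drops one page reference. */
void release_page_backed(struct xdp_buf_model *buf)
{
	buf->page_refcount--;
}

/* Zero-copy buffer: cleanup returns the frame to the userspace pool
 * and leaves the page count alone. */
void release_zc_frame(struct xdp_buf_model *buf)
{
	(*buf->zc_returned)++;
}

/* Redirect cleanup path: defer to the buffer's own destructor instead
 * of assuming the page allocator was used. */
void redirect_cleanup(struct xdp_buf_model *buf)
{
	if (buf->release)
		buf->release(buf);
}
```

With this shape, the same cleanup call works for both kinds of buffer, which is what would let a frame from a ZC-enabled RX ring be redirected out through another device in the model.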