Thread (39 messages), 6 authors, 2025-08-13

Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for XDP_REDIRECTed packets

From: Jakub Kicinski <kuba@kernel.org>
Date: 2025-08-01 20:38:05
Also in: bpf

On Thu, 31 Jul 2025 18:27:07 +0200 Jesper Dangaard Brouer wrote:
> > iirc, a xdp prog can be attached to a cpumap. The skb can be created by
> > that xdp prog running on the remote cpu. It should be like a xdp prog
> > returning a XDP_PASS + an optional skb. The xdp prog can set some fields
> > in the skb. Other than setting fields in the skb, something else may be
> > also possible in the future, e.g. look up sk, earlier demux ...etc.
>
> I have strong reservations about having the BPF program itself trigger
> the SKB allocation. I believe this would fundamentally break the
> performance model that makes cpumap redirect so effective.

See, I have similar concerns about growing struct xdp_frame.

That's why the guiding principle for me would be to make sure that 
the features we add, beyond "classic XDP" as needed by DDoS, are
entirely optional. And if we include the goal of moving skb allocation
out of the driver to the xdp_frame growth, the drivers will sooner or
later unconditionally populate the xdp_frame. Decreasing performance
of "classic XDP"?

> The key to XDP's high performance lies in processing a bulk of
> xdp_frames in a tight loop to amortize costs. The existing cpumap code
> on the remote CPU is already highly optimized for this: it performs bulk
> allocation of SKBs and uses careful prefetching to hide the memory
> latency. Allowing a BPF program to sometimes trigger a heavyweight SKB
> alloc+init (4 cache-line misses) would bypass all these existing
> optimizations. It would introduce significant jitter into the pipeline
> and disrupt the entire bulk-processing model we rely on for performance.
>
> This performance is not just theoretical;

Somewhat off-topic for the architecture, I think, but do you happen
to have any real life data for that? IIRC the "listification" was a
moderate success for the skb path.. Or am I misreading and you have
other benefits of a tight processing loop in mind?
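
The amortization argument in the quoted paragraph can be illustrated with a
small userspace model. This is only a sketch and not kernel code: struct frame,
struct pkt, BATCH, alloc_calls, process_per_item() and process_bulk() are
hypothetical names made up for the illustration, and counting malloc()/calloc()
calls merely stands in for the cost of SKB allocation. It says nothing about
real-world numbers, which is what the question above asks for.

```c
#include <stddef.h>
#include <stdlib.h>

#define BATCH 8                      /* hypothetical batch size, illustration only */

struct frame { unsigned int len; };  /* stand-in for an xdp_frame */
struct pkt   { unsigned int len; };  /* stand-in for an skb */

static size_t alloc_calls;           /* number of trips into the allocator */

/* Per-item path: one allocator call for every frame. */
static size_t process_per_item(const struct frame *frames, size_t n)
{
	size_t bytes = 0;

	for (size_t i = 0; i < n; i++) {
		struct pkt *p = malloc(sizeof(*p));

		alloc_calls++;
		if (!p)
			break;
		p->len = frames[i].len;
		bytes += p->len;
		free(p);
	}
	return bytes;
}

/* Bulk path: one allocator call per batch of up to BATCH frames. */
static size_t process_bulk(const struct frame *frames, size_t n)
{
	size_t bytes = 0;

	for (size_t base = 0; base < n; base += BATCH) {
		size_t cnt = n - base < BATCH ? n - base : BATCH;
		struct pkt *pkts = calloc(cnt, sizeof(*pkts));

		alloc_calls++;
		if (!pkts)
			break;
		/* Tight loop over the batch; the allocation cost is shared. */
		for (size_t i = 0; i < cnt; i++) {
			pkts[i].len = frames[base + i].len;
			bytes += pkts[i].len;
		}
		free(pkts);
	}
	return bytes;
}
```

In this model, 20 frames cost 20 allocator calls on the per-item path and 3 on
the bulk path (two full batches of 8 plus one of 4), while both paths account
for the same number of bytes.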