Re: [PATCH net v4 0/5] xsk: fix meta and publish of cq issues
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: 2026-05-22 18:34:08
Also in:
bpf
On Fri, May 22, 2026 at 09:48:39PM +0800, Jason Xing wrote:
On Fri, May 22, 2026 at 4:55 PM Jason Xing wrote:
On Thu, May 21, 2026 at 10:24 PM Maciej Fijalkowski wrote:
On Thu, May 21, 2026 at 09:07:30PM +0800, Jason Xing wrote:
On Thu, May 21, 2026 at 9:00 PM Maciej Fijalkowski wrote:
On Thu, May 21, 2026 at 08:41:08PM +0800, Jason Xing wrote:
On Thu, May 21, 2026 at 8:24 PM Maciej Fijalkowski wrote:
On Wed, May 20, 2026 at 08:42:39AM +0800, Jason Xing wrote:
From: Jason Xing <kernelxing@tencent.com>

The series is the product of previous review from sashiko[1].

1) META
patch 1: address TOCTOU around metadata.

2) PUBLISH of CQ
patch 2: make sure xsk_addr->addrs[] can be published to cq when overflow occurs.
patch 3: keep cleaning up the continuation descs (more than 17) and publish its address when overflow occurs.
patch 4: like patch 3, but only handles the invalid descs cases.

[1]: https://lore.kernel.org/all/20260502200722.53960-1-kerneljasonxing@gmail.com/

---
V4
Link: https://lore.kernel.org/all/20260517063311.28921-1-kerneljasonxing@gmail.com/
1. correct the description of xmit path in patch 3 (sashiko)
2. move set logic into xmit path in patch 3 (Stan)

V3
Link: https://lore.kernel.org/all/20260515123018.80147-1-kerneljasonxing@gmail.com/
1. avoid breaking previous usage of sendto, and silently handle overflow case (Stan, sashiko)
2. add one particular exception process in patch 4 (sashiko)
3. adjust the selftest to make sure it passes on either virtual or physical machines, which includes adding usleep to support physical machines.

V2
Link: https://lore.kernel.org/all/20260510012310.88570-1-kerneljasonxing@gmail.com/
1. adjust selftests (Jakub)
2. add READ_ONCE in patch 1 (Stan)

FWIW I still get test failures (yes with patch 5 applied). PTAL.

Thanks for the test. But I've tried with ixgbe driver... I noticed there are some flaky tests which have nothing to do with the series. Can you confirm that it's not caused by the series?

That explains the different results as I am using i40e/ice which have multi-buffer support whereas ixgbe does not even support mbuf at XDP. Broken tests are from mbuf cases.

That's weird. I never expected the failed tests to be about multi-buffer. Are they the same as the output you attached last time? Or something new?
Could you please share it so that I can investigate the root cause?

# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 21 FAIL: SKB ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 33 FAIL: SKB UNALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 57 FAIL: DRV ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 69 FAIL: DRV UNALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 93 FAIL: ZC ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [11], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# ---------------------------------------
not ok 94 FAIL: ZC TOO_MANY_FRAGS
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 105 FAIL: ZC UNALIGNED_INV_DESC_MULTI_BUFF
# 4 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:96 fail:8 xfail:0 xpass:0 skip:4 error:0
XSK_SELFTESTS_ens259f1np1_SOFTIRQ: [ FAIL ]

1..108
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 21 FAIL: SKB BUSY-POLL ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 33 FAIL: SKB BUSY-POLL UNALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 57 FAIL: DRV BUSY-POLL ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 69 FAIL: DRV BUSY-POLL UNALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 93 FAIL: ZC BUSY-POLL ALIGNED_INV_DESC_MULTI_BUFF
# [is_frag_valid] expected pkt_nb [11], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# ---------------------------------------
# [is_frag_valid] expected pkt_nb [10], got pkt_nb [0]
# DEBUG>> L2: dst mac: # 55# 44# 33# 22# 11# 01
# DEBUG>> L2: src mac: # 55# 44# 33# 22# 11# 00
# DEBUG>> L5: seqnum: # 0:0 # 0:1 # 0:2 # 0:3 # 0:4 # 0:5 # 0:6 # 0:7 # 0:8 # 0:9 # 0:10 # 0:11 # 0:0 # 0:0 # 0:0 # 0:0 # ....
# .... # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0 # 0:0
# ---------------------------------------
not ok 105 FAIL: ZC BUSY-POLL UNALIGNED_INV_DESC_MULTI_BUFF
# 4 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:96 fail:8 xfail:0 xpass:0 skip:4 error:0
XSK_SELFTESTS_ens259f1np1_BUSY_POLL: [ FAIL ]

Summary:
XSK_SELFTESTS_ens259f1np1_SOFTIRQ: [ FAIL ]
XSK_SELFTESTS_ens259f1np1_BUSY_POLL: [ FAIL ]

Sorry, Maciej. I managed to get one server with i40e nic but still couldn't reproduce it. Can you try the attachment (that is the replacement for v4-0005) instead? I removed those nasty CONT test cases...

Ah, I think I eventually figured out a solution. Maciej, could you please test the 2nd patch instead? This patch reworks the CONTD test cases. Cross finger.
Please don't rush things here; I believe we need to think a bit more. I have second thoughts about the overall approach.

My understanding wrt CQ was that it is a container that holds descriptors which have been successfully transmitted. Now we also want to add leftover descriptors from broken packets, which might confuse user space if it was relying on the behavior described above. The intent is right of course, as we don't want to lose UMEM descs, but I feel like we need a separate mechanism for that rather than putting invalid descs into the CQ. Does it make sense?

Besides, even if we stay with the proposed changes, behavior between modes should be aligned. Right now ZC seems to be broken in the regions touched here: when we hit the frag limit via pool->xdp_zc_max_segs, we break the loop and discard the packet, never post it to the CQ, and these descs are lost from the user space POV. Then we would continue on the next call, interpret the rest of the too-big packet as a separate (clamped) one, and therefore submit a corrupted packet to HW.

I'll be looking at the ZC API, but I do think we need a common, mode-agnostic approach.

Thanks,
Maciej
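To make the ZC concern above concrete, here is a minimal user-space model of that failure mode. It is only a sketch under stated assumptions, not the kernel code: `struct desc`, its `contd` flag (standing in for the "more frags follow" bit), and `xmit_one()` are hypothetical names invented for illustration, and `max_segs` plays the role of pool->xdp_zc_max_segs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical Tx descriptor: 'contd' means more frags of this packet follow. */
struct desc {
	unsigned long addr;
	bool contd;
};

/*
 * Model of one xmit call: consume descriptors from ring[*pos] until an
 * end-of-packet descriptor is seen or the frag limit is hit.
 *
 * Returns the number of frags handed to "HW" as one packet, or 0 if
 * nothing was sent. On hitting the limit, the frags consumed so far are
 * counted in *lost: they are neither sent nor completed, which is the
 * behavior described in the mail above (an assumption of this model).
 */
size_t xmit_one(const struct desc *ring, size_t n, size_t *pos,
		size_t max_segs, size_t *lost)
{
	size_t start = *pos;
	size_t frags = 0;

	assert(max_segs > 0);

	while (*pos < n) {
		bool contd = ring[(*pos)++].contd;

		frags++;
		if (!contd)
			return frags;		/* complete packet */
		if (frags == max_segs) {
			*lost += frags;		/* dropped, never completed */
			return 0;
		}
	}

	*pos = start;	/* packet not complete yet: leave it in the ring */
	return 0;
}
```

With a six-descriptor packet (five continuation descs plus a final one) and max_segs of 4, the first call returns 0 and counts four descriptors as lost; the second call starts at the fifth descriptor and returns 2, so the tail of the oversized packet goes out as if it were a packet of its own.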
Thanks,
Jason
Really, I don't think I have much time to spend on these tests, which makes me feel extremely annoyed... It's not easy to analyze the code without a reproducer. The good news is that I now strongly suspect that these CONT test cases pollute the whole CQ, which affects other tests.

Before I give up on the 0003/0004 patches, I'd like to hear some advice from you. Thank you.

My original intention was to push batch xmit forward, but at that time sashiko happened to point out some unrelated bugs.

Thanks,
Jason
Jason Xing (5):
  xsk: cache csum_start/csum_offset to fix TOCTOU in xsk_skb_metadata()
  xsk: fix buffer leak in xsk_drop_skb() for AF_XDP multi-buffer Tx
  xsk: drain continuation descs after overflow in xsk_build_skb()
  xsk: drain continuation descs on invalid descriptor in __xsk_generic_xmit()
  selftests/xsk: drain CQ to wait for TX completion

 include/net/xdp_sock.h                        |  1 +
 net/xdp/xsk.c                                 | 44 +++++++++++++----
 .../selftests/bpf/prog_tests/test_xsk.c       | 48 +++++++++++--------
 3 files changed, 63 insertions(+), 30 deletions(-)

--
2.43.7