After an invalid descriptor is found in __xsk_generic_xmit(),
xskq_cons_peek_desc() returns false and the loop body is not entered.
Jason's drain fixes reclaim descriptors already attached to xs->skb and
later continuation descriptors handled through drain_cont, but the
offending descriptor that made peek fail is only released from the Tx
ring.
This loses one completion for each invalid multi-buffer packet in the
generic path. Userspace then waits forever for a descriptor that has
already been consumed by the kernel.
If the failed descriptor belongs to an already-started or already-draining
multi-buffer packet, publish its address to the completion ring before
releasing it. Standalone invalid descriptors keep the existing behavior.
Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
net/xdp/xsk.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index c489fadc3608..43791647cf18 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1125,8 +1125,22 @@ static int __xsk_generic_xmit(struct sock *sk)
}
if (xskq_has_descs(xs->tx)) {
+ bool reclaim_desc = xs->skb || xs->drain_cont;
+
+ if (reclaim_desc) {
+ err = xsk_cq_reserve_locked(xs->pool);
+ if (err) {
+ err = -EAGAIN;
+ goto out;
+ }
+ }
+
if (xs->skb)
xsk_drop_skb(xs->skb);
+
+ if (reclaim_desc)
+ xsk_cq_submit_addr_single_locked(xs->pool, &desc);
+
xskq_cons_release(xs->tx);
xs->drain_cont = xp_mb_desc(&desc);
}--
2.43.0