[PATCH v3 bpf-next 08/11] bpf: tcp: Reject BPF_SOCK_OPS_RCVQ_CB if receive queue is not empty.
From: Kuniyuki Iwashima <kuniyu@google.com>
Date: 2026-05-23 08:30:11
Also in: bpf
Subsystem: bpf [general] (safe dynamic programs and tools), bpf [networking] (tcx & tc bpf, sock_addr), networking [general], the rest
Maintainers: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

Unlike SOCKMAP, BPF_SOCK_OPS_RCVQ_CB does not iterate over the skbs
already in the receive queue when it is enabled for the first time.

In production use cases, this is usually not a problem.

We can safely assume that the upper-layer protocol is designed
with specific synchronisation points where the connection is
temporarily quiet.

At these points, the application can drain the receive queue
completely and then enable BPF_SOCK_OPS_RCVQ_CB while no skbs
are pending.
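
The "drain" step on the application side can be sketched as follows. This is a minimal userspace illustration, not part of the patch: `drain_rcvq()` is a hypothetical helper name, and it simply discards the bytes it reads, whereas a real application would parse them as the tail of the previous protocol.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): read and discard everything
 * currently queued on fd without blocking.
 * Returns the number of bytes drained, or -1 on error with errno set.
 */
static ssize_t drain_rcvq(int fd)
{
	char buf[4096];
	ssize_t total = 0;

	for (;;) {
		ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)		/* peer closed; nothing more will arrive */
			return total;
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* queue is empty right now */
		return -1;
	}
}
```

An empty queue at the moment `drain_rcvq()` returns says nothing about data arriving afterwards, which is why the protocol-level quiet point described above still matters.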

A prime example is an application transitioning from HTTP to an
RPC protocol:

  Client                                 Server
    |                                      |
    | --- HTTP Upgrade request ----------> |
    |                                      | [Drain all skbs]
    |                                      | [Enable BPF_SOCK_OPS_RCVQ_CB]
    | <-- HTTP 200/Switching protocol ---- |
    |                                      |
    | --- RPC Frame 1 -------------------> |

However, to strictly prevent potential race conditions arising from
unconventional upper-layer protocol designs, let's explicitly fail
when BPF_SOCK_OPS_RCVQ_CB is enabled while the receive queue is not
empty.

-EUCLEAN is chosen to indicate that the caller needs to clean up
the receive queue before enabling the feature.
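
One way a caller might react to -EUCLEAN is to drain again and retry a bounded number of times. The sketch below is an assumption about caller-side policy, not something this patch implements: `enable_rcvq_cb_retry()`, `step_fn`, and the `fake_*` stand-ins are all hypothetical names, and the fake socket only models the "queue not empty" check so the logic can be exercised without a kernel.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*step_fn)(void *ctx);

/*
 * Drain, then try to enable. If enabling fails with -EUCLEAN (data arrived
 * in between), repeat up to max_tries times. Any other result is returned
 * to the caller unchanged.
 */
static int enable_rcvq_cb_retry(step_fn drain, step_fn enable,
				void *ctx, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		int err = drain(ctx);

		if (err < 0)
			return err;

		err = enable(ctx);
		if (err != -EUCLEAN)
			return err;
	}
	return -EUCLEAN;
}

/* Toy stand-ins so the sketch can run in userspace. */
struct fake_sock {
	int queued;		/* skbs pending in the receive queue */
	int late_arrivals;	/* skbs that show up right after a drain */
	int enabled;
};

static int fake_drain(void *ctx)
{
	struct fake_sock *s = ctx;

	s->queued = 0;
	if (s->late_arrivals > 0) {	/* peer sent data mid-transition */
		s->late_arrivals--;
		s->queued = 1;
	}
	return 0;
}

static int fake_enable(void *ctx)
{
	struct fake_sock *s = ctx;

	if (s->queued)
		return -EUCLEAN;	/* mirrors the check added by this patch */
	s->enabled = 1;
	return 0;
}
```

If the retries are exhausted, the protocol most likely lacks a real quiet point, and the caller should surface the error rather than loop indefinitely.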
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/core/filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 82ec2291d6f0..4041b9fc1c74 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5390,6 +5390,9 @@ static int __bpf_sock_ops_cb_flags_set(struct sock *sk, int val)
 		return 0;
 	}
 
+	if (!skb_queue_empty(&sk->sk_receive_queue))
+		return -EUCLEAN;
+
 	if (unlikely(sk_is_mptcp(sk)))
 		return -EOPNOTSUPP;
 
--
2.54.0.746.g67dd491aae-goog