Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for... | netdev

[PATCH v3 bpf-next 00/11] bpf: Add SOCK_OPS hooks for TCP AutoLOWAT. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
[PATCH v3 bpf-next 01/11] selftest: bpf: Use BPF_SOCK_OPS_ALL_CB_FLAGS + 1 for bad_cb_test_rv. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 01/11] selftest: bpf: Use BPF_SOCK_OPS_ALL_CB_FLAGS + 1 for bad_cb_test_rv. · bot+bpf-ci@kernel.org · 2026-05-23
[PATCH v3 bpf-next 02/11] bpf: tcp: Introduce BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
[PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-26
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-26
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-26
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Jason Xing <hidden> · 2026-05-27
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-27
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-27
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-27
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-28
Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB. · Jason Xing <hidden> · 2026-05-28
[PATCH v3 bpf-next 04/11] tcp: Split out __tcp_set_rcvlowat(). · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
[PATCH v3 bpf-next 05/11] bpf: tcp: Add kfunc to adjust sk->sk_rcvlowat. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 05/11] bpf: tcp: Add kfunc to adjust sk->sk_rcvlowat. · bot+bpf-ci@kernel.org · 2026-05-23
[PATCH v3 bpf-next 06/11] bpf: tcp: Make BPF_SOCK_OPS_RCVQ_CB and SOCKMAP mutually exclusive. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 06/11] bpf: tcp: Make BPF_SOCK_OPS_RCVQ_CB and SOCKMAP mutually exclusive. · bot+bpf-ci@kernel.org · 2026-05-23
Re: [PATCH v3 bpf-next 06/11] bpf: tcp: Make BPF_SOCK_OPS_RCVQ_CB and SOCKMAP mutually exclusive. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-24
[PATCH v3 bpf-next 08/11] bpf: tcp: Reject BPF_SOCK_OPS_RCVQ_CB if receive queue is not empty. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 08/11] bpf: tcp: Reject BPF_SOCK_OPS_RCVQ_CB if receive queue is not empty. · bot+bpf-ci@kernel.org · 2026-05-23
[PATCH v3 bpf-next 07/11] bpf: mptcp: Don't support BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
[PATCH v3 bpf-next 09/11] bpf: tcp: Factorise bpf_skops_established(). · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
[PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-26
Re: [PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-26
Re: [PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook. · Amery Hung <hidden> · 2026-05-26
Re: [PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-26
[PATCH v3 bpf-next 11/11] selftest: bpf: Add test for BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-23
Re: [PATCH v3 bpf-next 11/11] selftest: bpf: Add test for BPF_SOCK_OPS_RCVQ_CB. · bot+bpf-ci@kernel.org · 2026-05-23
Re: [PATCH v3 bpf-next 11/11] selftest: bpf: Add test for BPF_SOCK_OPS_RCVQ_CB. · Kuniyuki Iwashima <kuniyu@google.com> · 2026-05-24
Re: [PATCH v3 bpf-next 11/11] selftest: bpf: Add test for BPF_SOCK_OPS_RCVQ_CB. · Martin KaFai Lau <martin.lau@linux.dev> · 2026-05-26

Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB.

From: Kuniyuki Iwashima <kuniyu@google.com>
Date: 2026-05-26 21:22:10
Also in: bpf

On Tue, May 26, 2026 at 1:34 PM Martin KaFai Lau [off-list ref] wrote:

On Sat, May 23, 2026 at 08:29:32AM +0000, Kuniyuki Iwashima wrote:

quoted

When a TCP skb is queued to sk->sk_receive_queue, BPF SOCK_OPS
prog can be called with BPF_SOCK_OPS_RCVQ_CB.

In this hook, we want to parse the RPC descriptor in the skb
and adjust sk->sk_rcvlowat based on the RPC frame size.

However, we cannot access payload via bpf_sock_ops.data on
modern NICs with TCP header/data split on as the payload is
not placed in the linear area.

Let's support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB.

Three notes:

  1) bpf_sock_ops_kern.skb will be NULL when the BPF prog is
      invoked from recvmsg().

  2) Access to bpf_sock_ops.data will be disabled by passing
      0 end_offset to bpf_skops_init_skb().

  3) ____bpf_skb_load_bytes() is called directly instead of
     __bpf_skb_load_bytes() to allow compilers to inline it
     instead of generating a tail-call.

Some observations below.

quoted

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Explain why using ____ version instead of __
---
 net/core/filter.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 4a50fe2cd863..fa8a7c7d86eb 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c

@@ -7760,6 +7760,38 @@ static const struct bpf_func_proto bpf_sk_assign_proto = {
      .arg3_type      = ARG_ANYTHING,
 };

+BPF_CALL_4(bpf_sock_ops_skb_load_bytes, struct bpf_sock_ops_kern *, bpf_sock,
+        u32, offset, void *, to, u32, len)
+{
+     int err;
+
+     if (bpf_sock->op != BPF_SOCK_OPS_RCVQ_CB) {

bpf_dynptr_from_skb() and bpf_dynptr_slice() kfunc could also be considered.
One less bpf_sock->op check in filter.c to maintain and could also avoid
a data copy. There is a bpf_cast_to_kern_ctx() to get to a trusted
skops_kern pointer but this will need changes in verifier.c to get to
skops_kern->skb (e.g. in type_is_trusted_or_null) and this is the tradeoff.

Maybe a dumb question, but does it add extra cost (extra dynptr
function call?) if data overlaps two frags, or can dynptr handle it
seamlessly with a single bpf_dynptr_slice() ?

In our case, the data copy is ~16 bytes, so the cost will not be
a big problem I think.

If this new rcvq callback is added to the 'bpf_tcp_ops' proposal [1],
all this will go away. 'struct sk_buff *skb' can be directly passed to an
ops of the 'bpf_tcp_ops'. Supporting '*skb' in a struct_ops has already
been done in the bpf_qdisc.

[1]: https://lore.kernel.org/bpf/20260519215841.2984970-11-martin.lau@linux.dev/ (local)

Oh I missed the series, the struct_ops conversion looks nice !
Since this work isn't urgent, I can wait for your series if mine
churns it.

Jason's series is adding a new op, and I guess this can be
integrated too ?
https://lore.kernel.org/bpf/20260521135244.40869-5-kerneljasonxing@gmail.com/ (local)

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help