RE: [Patch bpf-next v7 09/13] udp: implement ->read_sock() for sockmap
From: John Fastabend <john.fastabend@gmail.com>
Date: 2021-03-29 20:55:08
Also in: bpf
Cong Wang wrote:
From: Cong Wang <redacted>

This is similar to tcp_read_sock(), except we do not need to worry about
connections, we just need to retrieve skb from UDP receive queue.

Note, the return value of ->read_sock() is unused in
sk_psock_verdict_data_ready().

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Lorenz Bauer <redacted>
Signed-off-by: Cong Wang <redacted>
---
 include/net/udp.h   |  2 ++
 net/ipv4/af_inet.c  |  1 +
 net/ipv4/udp.c      | 35 +++++++++++++++++++++++++++++++++++
 net/ipv6/af_inet6.c |  1 +
 4 files changed, 39 insertions(+)

diff --git a/include/net/udp.h b/include/net/udp.h
index df7cc1edc200..347b62a753c3 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -329,6 +329,8 @@ struct sock *__udp6_lib_lookup(struct net *net,
 struct sk_buff *skb);
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 __be16 sport, __be16 dport);
+int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor);

 /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
 * possibly multiple cache miss on dequeue()
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 1355e6c0d567..f17870ee558b 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1070,6 +1070,7 @@ const struct proto_ops inet_dgram_ops = {
 .setsockopt = sock_common_setsockopt,
 .getsockopt = sock_common_getsockopt,
 .sendmsg = inet_sendmsg,
+ .read_sock = udp_read_sock,
 .recvmsg = inet_recvmsg,
 .mmap = sock_no_mmap,
 .sendpage = inet_sendpage,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 38952aaee3a1..04620e4d64ab 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1782,6 +1782,41 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
 }
 EXPORT_SYMBOL(__skb_recv_udp);

+int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor)
+{
+ int copied = 0;
+
+ while (1) {
+ int offset = 0, err;

Should this be int offset = sk_peek_offset()? MSG_PEEK should work from the recv side; at least it does on the TCP side. If it's handled in a following patch, a comment would be nice. I was just reading udp_recvmsg(), so maybe it's not needed.

+ struct sk_buff *skb;
+
+ skb = __skb_recv_udp(sk, 0, 1, &offset, &err);
+ if (!skb)
+ return err;
+ if (offset < skb->len) {
+ size_t len;
+ int used;
+
+ len = skb->len - offset;
+ used = recv_actor(desc, skb, offset, len);
+ if (used <= 0) {
+ if (!copied)
+ copied = used;
+ break;
+ } else if (used <= len) {
+ copied += used;
+ offset += used;

The while loop is going to zero this? What are we trying to do here with offset?

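To make the concern concrete, here is a minimal userspace sketch, not kernel code. It only models the scoping: demo_actor() and both function names are hypothetical stand-ins, and nothing here touches sockets or skbs. With offset declared inside the loop body, as in the hunk above, the update after the actor call is never seen by the next pass.

```c
#include <stddef.h>

/* Hypothetical actor: consumes at most `chunk` bytes of a `len`-byte datagram. */
static int demo_actor(size_t len, size_t chunk)
{
	return (int)(len < chunk ? len : chunk);
}

/*
 * Same shape as the loop in the patch: `offset` is declared and
 * initialised inside the loop body. Returns the value of `offset`
 * observed at the start of the last iteration.
 */
static int offset_at_last_iter_loop_scoped(size_t len, size_t chunk, int iters)
{
	int seen = -1;
	int i = 0;

	while (i < iters) {
		int offset = 0;	/* re-initialised on every pass */

		seen = offset;
		offset += demo_actor(len - offset, chunk);	/* dead store */
		(void)offset;
		i++;
	}
	return seen;
}

/* Variant with `offset` hoisted out of the loop, for contrast only. */
static int offset_at_last_iter_hoisted(size_t len, size_t chunk, int iters)
{
	int seen = -1;
	int offset = 0;
	int i = 0;

	while (i < iters) {
		seen = offset;
		offset += demo_actor(len - offset, chunk);
		i++;
	}
	return seen;
}
```

Calling offset_at_last_iter_loop_scoped(100, 30, 3) yields 0, while the hoisted variant yields 60. Whether the patch wants either behaviour is exactly the open question; the sketch only shows that the increment as written has no effect on later iterations.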
+ }
+ }
+ if (!desc->count)
+ break;
+ }
+
+ return copied;
+}
+EXPORT_SYMBOL(udp_read_sock);
+
 /*
 * This should be easy, if there is something there we
 * return it, otherwise we block.
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 802f5111805a..71de739b4a9e 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -714,6 +714,7 @@ const struct proto_ops inet6_dgram_ops = {
 .getsockopt = sock_common_getsockopt, /* ok */
 .sendmsg = inet6_sendmsg, /* retpoline's sake */
 .recvmsg = inet6_recvmsg, /* retpoline's sake */
+ .read_sock = udp_read_sock,
 .mmap = sock_no_mmap,
 .sendpage = sock_no_sendpage,
 .set_peek_off = sk_set_peek_off,
--
2.25.1