Thread (5 messages) 5 messages, 4 authors, 20d ago

Re: [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync

From: John Fastabend <john.fastabend@gmail.com>
Date: 2026-05-20 16:59:17
Also in: bpf, lkml, stable

On Wed, May 20, 2026 at 11:09:36AM +0000, bot+bpf-ci@kernel.org wrote:
quoted
diff --git a/net/core/filter.c b/net/core/filter.c
index 9590877b0714..223d78708486 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
[ ... ]
quoted
@@ -2794,6 +2835,8 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 {
 	struct scatterlist sge, nsge, nnsge, rsge = {0}, *psge;
 	u32 new, i = 0, l = 0, space, copy = 0, offset = 0;
+	bool sge_copy = false, nsge_copy = false, nnsge_copy = false;
+	bool rsge_copy = false;
 	u8 *raw, *to, *from;
 	struct page *page;
@@ -2866,6 +2909,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 			sk_msg_iter_var_prev(i);
 		psge = sk_msg_elem(msg, i);
 		rsge = sk_msg_elem_cpy(msg, i);
+		rsge_copy = sk_msg_elem_is_copy(msg, i);

 		psge->length = start - offset;
 		rsge.length -= psge->length;
Zhang,

Did you want to push two smallish patches to fix these two? I have had
similar patches on my test system for a few days so they LGTM. These
two fixes
   page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC |
                      __GFP_COMP | __GFP_ZERO,
                      get_order(copy + len));
and
   rsge.offset += start - offset;
bot must be reading reviews because I called that out in v2.

[...]
Could the ring buffer state become corrupted when data is appended exactly
at the end of the message and the ring buffer is full?

If data is appended exactly at the end of the message (start == offset +
l), the loop finishes with i pointing to the unused slot msg->sg.end. If
the ring buffer is full (!space), the coalescing fallback logic reads:

   copy = msg->sg.data[i].length

Because i points to an unused slot, its length is 0, making copy == 0, and
the fallback coalescing block is bypassed. The function then
unconditionally calls:

   sk_msg_iter_next(msg, end)

which increments msg->sg.end. Since the ring buffer was already full,
advancing msg->sg.end causes it to wrap around and equal msg->sg.start.
This could corrupt the ring buffer state and overwrite the first element
without calling put_page(), leading to page leaks.
^^^

This one I haven't looked into yet. Let me know if you have time to get
to above two issues this week would be great to get a v1 out at least.

Thanks,
John
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help