Thread (6 messages) 6 messages, 2 authors, 2026-01-18

Re: [PATCH] net: fix udp gso skb_segment after pull from frag_list

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2026-01-18 02:28:08
Also in: lkml

Shiming Cheng (成诗明) wrote:
Dear Willem

BUG_ON(!list_skb->head_frag);   <<<<<<<< an exception was triggered
here.

trigger exception condition:
1. TX device scatter-gather: on
dev features=18433 => 100100000000001  => last bit is 1 == SG enable 
2. skb->head_frag is 1
3. udp gro enable


Because the target device has enabled SG (scatter-gather), and skb-
quoted
head_frag is 0, the subsequent head cannot be added to frags.
In other words, when head_frag is 0, the target device is not allowed
to enable SG (scatter-gather).
I  still need to reproduce the problem, but from analyzing the code, it
Do you mean that this happened somewhere (e.g., in production), but
you are not able to trigger it in a synthetic test yet?

(btw small reminder: please don't top post)
appears that the issue is caused by the target device enabling SG. 
SG is the default. It makes sense that disabling SG disables
skb frags and avoids this part of the code.
 
struct sk_buff *skb_segment(struct sk_buff *head_skb,
                  
          netdev_features_t features)
{
...
        while (pos < offset + len) {
                        if (i >= nfrags) {
                                i = 0;
                                nfrags = skb_shinfo(list_skb)-
quoted
nr_frags;
                                frag = skb_shinfo(list_skb)->frags;
                                frag_skb = list_skb;
                                if (!skb_headlen(list_skb)) {
                                        BUG_ON(!nfrags);
                                } else {
                                        BUG_ON(!list_skb-
quoted
head_frag);   <<<<<<<< an exception was triggered here.
                                        /* to make room for head_frag.
*/
                                        i--;
                                        frag--;
Thanks. So commit 13acc94eff12 ("net: permit skb_segment on head_frag
frag_list skb") extended support for segmenting of list_skb with
frag_list, but only if the head if present is also frag_based.

This is obsure enough that I agree that linearizing such packets is an
okay workaround.

But we may be able to do one better. Disabling sg for this input will
build linear nskbs and avoid this code. At the top of skb_segment.

And such code already exists, refined most recently in commit
9e4b7a99a03a ("net: gso: fix panic on frag_list with mixed head alloc
types"). Did you observe this on a recent kernel? If not, which
version?

What kind of BPF program are you running, especially which functions
does it call that might change skb geometry. All such functions in
net/core/filter. set SKB_GSO_DODGY, which will make skb_segment enter
this check branch.


Re: your patch

+	} else if (skb_shinfo(gso_skb)->frag_list && gso_skb->head_frag == 0) {

The issue here is that list_skb->head_frag == 0, not gso_skb. Not sure
that gso_skb->head_frag == 0 implies list_skb->head_frag == 0.


+		if (skb_pagelen(gso_skb) - sizeof(*uh) != skb_shinfo(gso_skb)->gso_size) {

This is copied but AFAIK not relevant here.
                                }
                                if (skb_orphan_frags(frag_skb,
GFP_ATOMIC) ||
                                    skb_zerocopy_clone(nskb, frag_skb,
                                                       GFP_ATOMIC))
                                        goto err;

                                list_skb = list_skb->next;
                        }

}

#0  0xffffffa17e5b4184 in skb_segment (head_skb=<optimized out>,
features=<optimized out>)
    at ../../../../../../kernel-4.19/net/core/skbuff.c:3774
#1  0xffffffa17e68a4b0 in __udp_gso_segment
(gso_skb=0xffffffde2973fe00, features=18433)
    at ../../../../../../kernel-4.19/net/ipv4/udp_offload.c:215
#2  0xffffffa17e68ad90 in udp4_ufo_fragment (skb=0xffffffde2973fe00,
features=18433)
    at ../../../../../../kernel-4.19/net/ipv4/udp_offload.c:316
#3  0xffffffa17e694e04 in inet_gso_segment (skb=0xffffffde2973fe00,
features=18433)
    at ../../../../../../kernel-4.19/net/ipv4/af_inet.c:1342
#4  0xffffffa17e5c4b48 in skb_mac_gso_segment (skb=0xffffffde2973fe00,
features=18433)


On Wed, 2026-01-07 at 11:00 -0500, Willem de Bruijn wrote:
quoted
External email : Please do not click links or open attachments until
you have verified the sender or the content.


Shiming Cheng (成诗明) wrote:
quoted
Dear Willem

frag_list assignment trigger:
frag_list is assigned to the next skb without to enable device
feature
NETIF_F_GRO_FRAGLIST if head_frag is zero.
Thanks for the trace, so this is a regular GRO packet using
skb_gro_receive that ended up having to use frag_list.
quoted
After packet in fraglist is processed by BPF pull tial, performing
GSO
segmentation directly will cause problems.
That's peculiar. For such packets, there is no expectation that all
frag_list skbs hold an exact segment.

It would be helpful if the stack trace can be extended to show the
line numbers. And/or if you can show the BUG that is being hit in
skb_segment.

I agree that we probably just need to detect such modified packets
and
linearize them. But let's get a more detailed root cause first.
quoted
udp_gro_receive_segment
  skb_gro_receive
   if (skb->head_frag==0)
     goto merge;


merge:
        /* sk ownership - if any - completely transferred to the
aggregated packet */
        skb->destructor = NULL;
        skb->sk = NULL;
        delta_truesize = skb->truesize;
        if (offset > headlen) {
                unsigned int eat = offset - headlen;

                skb_frag_off_add(&skbinfo->frags[0], eat);
                skb_frag_size_sub(&skbinfo->frags[0], eat);
                skb->data_len -= eat;
                skb->len -= eat;
                offset = headlen;
        }

        __skb_pull(skb, offset);

        if (NAPI_GRO_CB(p)->last == p)
                skb_shinfo(p)->frag_list = skb;
        <<< here frag_list is assigned to the next skb without to
enable device feature NETIF_F_GRO_FRAGLIST if head_frag is zero
        else
                NAPI_GRO_CB(p)->last->next = skb;
        NAPI_GRO_CB(p)->last = skb;
        __skb_header_release(skb);
        lp = p;




On Tue, 2026-01-06 at 14:15 -0500, Willem de Bruijn wrote:
quoted
External email : Please do not click links or open attachments
until
you have verified the sender or the content.


Shiming Cheng wrote:
quoted
Commit 3382a1ed7f77 ("net: fix udp gso skb_segment after  pull
from
frag_list")
if gso_type is not SKB_GSO_FRAGLIST but skb->head_frag is zero,
What codepath triggers this scenario?

We should make sure that the fix covers all such instances.
Likely
instances of where some module in the datapath, like a BPF
program,
modifies a valid skb into one that is not safe to pass to
skb_segment.

I don't fully understand yet that skb->head_frag == 0 is the only
such condition in scope.
quoted
then detected invalid geometry in frag_list skbs and call
skb_segment. But some packets with modified geometry can also
hit
bugs in that code. Instead, linearize all these packets that
fail
the basic invariants on gso fraglist skbs. That is more robust.
call stack information, see below.

Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size

Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data)
can
modify fraglist skbs, breaking these invariants.

In extreme cases they pull one part of data into skb linear.
For
UDP,
this  causes three payloads with lengths of (11,11,10) bytes
were
pulled tail to become (12,10,10) bytes.

The skbs no longer meets the above SKB_GSO_FRAGLIST conditions
because
payload was pulled into head_skb, it needs to be linearized
before
pass
to regular skb_segment.
Most of this commit message duplicates the text in commit
3382a1ed7f77
("net: fix udp gso skb_segment after  pull from frag_list"). And
somewhat garbles it, as in the first sentence.

But this is a different datapath, not related to
SKB_GSO_FRAGLIST.
So the fixes tag is also incorrect. The blamed commit fixes an
issue
with fraglist GRO. This new issue is with skbs that have a
fraglist,
but not one created with that feature. (the naming is confusing,
but
fraglist-gro is only one use of the skb frag_list).
quoted
  skb_segment+0xcd0/0xd14
  __udp_gso_segment+0x334/0x5f4
  udp4_ufo_fragment+0x118/0x15c
  inet_gso_segment+0x164/0x338
  skb_mac_gso_segment+0xc4/0x13c
  __skb_gso_segment+0xc4/0x124
  validate_xmit_skb+0x9c/0x2c0
  validate_xmit_skb_list+0x4c/0x80
  sch_direct_xmit+0x70/0x404
  __dev_queue_xmit+0x64c/0xe5c
  neigh_resolve_output+0x178/0x1c4
  ip_finish_output2+0x37c/0x47c
  __ip_finish_output+0x194/0x240
  ip_finish_output+0x20/0xf4
  ip_output+0x100/0x1a0
  NF_HOOK+0xc4/0x16c
  ip_forward+0x314/0x32c
  ip_rcv+0x90/0x118
  __netif_receive_skb+0x74/0x124
  process_backlog+0xe8/0x1a4
  __napi_poll+0x5c/0x1f8
  net_rx_action+0x154/0x314
  handle_softirqs+0x154/0x4b8

  [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at
net/core/skbuff.c:4278!
  [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error:
Oops - BUG: 00000000f2000800 [#1]
  [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset:
0x178cc00000 from 0xffffffc008000000
  [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET:
0x40000000
  [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate:
60400005
(nZCv daif +PAN -UAO)
  [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc :
[0xffffffd79598aefc] skb_segment+0xcd0/0xd14
  [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr :
[0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
  [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp :
ffffffc008013770

Fixes: 3382a1ed7f77 ("net: fix udp gso skb_segment after pull
from
frag_list")
Signed-off-by: Shiming Cheng <redacted>
---
 net/ipv4/udp_offload.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 19d0b5b09ffa..606d9ce8c98e 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -535,6 +535,12 @@ struct sk_buff *__udp_gso_segment(struct
sk_buff *gso_skb,
                      uh->check = ~udp_v4_check(gso_skb->len,
                                                ip_hdr(gso_skb)
-
quoted
saddr,
                                                ip_hdr(gso_skb)
-
quoted
daddr, 0);
+     } else if (skb_shinfo(gso_skb)->frag_list && gso_skb-
quoted
head_frag == 0) {
+             if (skb_pagelen(gso_skb) - sizeof(*uh) !=
skb_shinfo(gso_skb)->gso_size) {
+                     ret = __skb_linearize(gso_skb);
+                     if (ret)
+                             return ERR_PTR(ret);
+             }
      }

      skb_pull(gso_skb, sizeof(*uh));
--
2.45.2
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help