Re: [PATCH net v4 5/8] net/sched: netem: batch-transfer ready packets to avoid child re-entrancy
From: Simon Horman <horms@kernel.org>
Date: 2026-04-10 11:37:30
Also in:
lkml
This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html --- net/sched: netem: batch-transfer ready packets to avoid child re-entrancy netem_dequeue_child() previously transferred one packet from the tfifo to the child qdisc per dequeue call. Parents like HFSC that track class active/inactive state on qlen transitions could see an enqueue during dequeue, causing double-insertion into the eltree (CVE-2025-37890, CVE-2025-38001). Non-work-conserving children like TBF could also refuse to return a just-enqueued packet, making netem return NULL despite having backlog, which causes parents like DRR to incorrectly deactivate the class. Move all time-ready packets into the child before calling its dequeue. This separates the enqueue and dequeue phases so the parent sees consistent qlen transitions.
quoted hunk ↗ jump to hunk
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index e264f7aefb97..b93f0e886a2b 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c
[ ... ]
quoted hunk ↗ jump to hunk
@@ -743,31 +745,28 @@ static struct sk_buff *netem_dequeue_child(struct Qdisc *sch) u64 now = ktime_get_ns(); struct sk_buff *skb; - skb = netem_peek(q); - if (skb) { - u64 time_to_send = netem_skb_cb(skb)->time_to_send; - - if (q->slot.slot_next && q->slot.slot_next < time_to_send) - get_slot_next(q, now); - - if (time_to_send <= now && q->slot.slot_next <= now) { - struct sk_buff *to_free = NULL; - unsigned int pkt_len; - int err; + while ((skb = netem_peek(q)) != NULL) { + struct sk_buff *to_free = NULL; + unsigned int pkt_len; + int err; - skb = netem_pull_tfifo(q, sch); - netem_slot_account(q, skb, now); + if (netem_skb_cb(skb)->time_to_send > now) + break; + if (q->slot.slot_next && q->slot.slot_next > now) + break;
Does the batch transfer preserve slot gap behavior?
The old code called get_slot_next(q, now) when a slot expired by time:
if (q->slot.slot_next && q->slot.slot_next < time_to_send)
get_slot_next(q, now);
This advanced the slot to a new future window (slot_next = now + next_delay),
enforcing inter-slot gaps when a slot expired by time rather than by packet
or byte budget exhaustion.
The new batching loop only checks 'q->slot.slot_next > now' to break, and
relies solely on netem_slot_account() -> get_slot_next() for slot advancement
(which only triggers when packets_left or bytes_left reach zero).
When a slot expires by time but still has remaining budget, the new code
transfers packets immediately without creating a new inter-slot gap.
netem_dequeue_direct() still has the old slot gap logic:
if (q->slot.slot_next && q->slot.slot_next < time_to_send)
get_slot_next(q, now);
Should netem_dequeue_child() call get_slot_next() before the batching loop
to maintain consistent slot-based traffic shaping semantics between the
direct and child paths?
- pkt_len = qdisc_pkt_len(skb);
- err = qdisc_enqueue(skb, q->qdisc, &to_free);
- kfree_skb_list(to_free);
- if (err != NET_XMIT_SUCCESS) {
- if (net_xmit_drop_count(err))
- qdisc_qstats_drop(sch);
- sch->qstats.backlog -= pkt_len;
- sch->q.qlen--;
- qdisc_tree_reduce_backlog(sch, 1, pkt_len);
- }
+ skb = netem_pull_tfifo(q, sch);
+ netem_slot_account(q, skb, now);
+
+ pkt_len = qdisc_pkt_len(skb);
+ err = qdisc_enqueue(skb, q->qdisc, &to_free);
+ kfree_skb_list(to_free);
+ if (unlikely(err != NET_XMIT_SUCCESS)) {
+ if (net_xmit_drop_count(err))
+ qdisc_qstats_drop(sch);
+ sch->qstats.backlog -= pkt_len;
+ sch->q.qlen--;
+ qdisc_tree_reduce_backlog(sch, 1, pkt_len);
}
}