Re: [PATCH net v4 5/8] net/sched: netem: batch-transfer ready packets to... | netdev

Re: [PATCH net v4 5/8] net/sched: netem: batch-transfer ready packets to avoid child re-entrancy

From: Simon Horman <horms@kernel.org>
Date: 2026-04-10 11:37:30
Also in: lkml

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/sched: netem: batch-transfer ready packets to avoid child re-entrancy

netem_dequeue_child() previously transferred one packet from the tfifo
to the child qdisc per dequeue call. Parents like HFSC that track
class active/inactive state on qlen transitions could see an enqueue
during dequeue, causing double-insertion into the eltree
(CVE-2025-37890, CVE-2025-38001). Non-work-conserving children like
TBF could also refuse to return a just-enqueued packet, making netem
return NULL despite having backlog, which causes parents like DRR to
incorrectly deactivate the class.

Move all time-ready packets into the child before calling its dequeue.
This separates the enqueue and dequeue phases so the parent sees
consistent qlen transitions.

quoted hunk ↗ jump to hunk

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index e264f7aefb97..b93f0e886a2b 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c

[ ... ]

quoted hunk ↗ jump to hunk

@@ -743,31 +745,28 @@ static struct sk_buff *netem_dequeue_child(struct Qdisc *sch)
 	u64 now = ktime_get_ns();
 	struct sk_buff *skb;

-	skb = netem_peek(q);
-	if (skb) {
-		u64 time_to_send = netem_skb_cb(skb)->time_to_send;
-
-		if (q->slot.slot_next && q->slot.slot_next < time_to_send)
-			get_slot_next(q, now);
-
-		if (time_to_send <= now && q->slot.slot_next <= now) {
-			struct sk_buff *to_free = NULL;
-			unsigned int pkt_len;
-			int err;
+	while ((skb = netem_peek(q)) != NULL) {
+		struct sk_buff *to_free = NULL;
+		unsigned int pkt_len;
+		int err;

-			skb = netem_pull_tfifo(q, sch);
-			netem_slot_account(q, skb, now);
+		if (netem_skb_cb(skb)->time_to_send > now)
+			break;
+		if (q->slot.slot_next && q->slot.slot_next > now)
+			break;

Does the batch transfer preserve slot gap behavior?

The old code called get_slot_next(q, now) when a slot expired by time:

    if (q->slot.slot_next && q->slot.slot_next < time_to_send)
        get_slot_next(q, now);

This advanced the slot to a new future window (slot_next = now + next_delay),
enforcing inter-slot gaps when a slot expired by time rather than by packet
or byte budget exhaustion.

The new batching loop only checks 'q->slot.slot_next > now' to break, and
relies solely on netem_slot_account() -> get_slot_next() for slot advancement
(which only triggers when packets_left or bytes_left reach zero).

When a slot expires by time but still has remaining budget, the new code
transfers packets immediately without creating a new inter-slot gap.

netem_dequeue_direct() still has the old slot gap logic:

    if (q->slot.slot_next && q->slot.slot_next < time_to_send)
        get_slot_next(q, now);

Should netem_dequeue_child() call get_slot_next() before the batching loop
to maintain consistent slot-based traffic shaping semantics between the
direct and child paths?

-			pkt_len = qdisc_pkt_len(skb);
-			err = qdisc_enqueue(skb, q->qdisc, &to_free);
-			kfree_skb_list(to_free);
-			if (err != NET_XMIT_SUCCESS) {
-				if (net_xmit_drop_count(err))
-					qdisc_qstats_drop(sch);
-				sch->qstats.backlog -= pkt_len;
-				sch->q.qlen--;
-				qdisc_tree_reduce_backlog(sch, 1, pkt_len);
-			}
+		skb = netem_pull_tfifo(q, sch);
+		netem_slot_account(q, skb, now);
+
+		pkt_len = qdisc_pkt_len(skb);
+		err = qdisc_enqueue(skb, q->qdisc, &to_free);
+		kfree_skb_list(to_free);
+		if (unlikely(err != NET_XMIT_SUCCESS)) {
+			if (net_xmit_drop_count(err))
+				qdisc_qstats_drop(sch);
+			sch->qstats.backlog -= pkt_len;
+			sch->q.qlen--;
+			qdisc_tree_reduce_backlog(sch, 1, pkt_len);
 		}
 	}

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help