Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
From: David Miller <davem@davemloft.net>
Date: 2008-09-20 07:21:49
From: Jarek Poplawski <redacted> Date: Sun, 14 Sep 2008 22:27:15 +0200 [ I'm just picking one posting out of several in this thread. ]
Anyway, the main problem here was a high cpu load despite stopped queue. Are you sure this peek, which is almost full dequeue, can really help for this? BTW, since after current fix there were no later complains I guess it's just about full netif_stop or non-mq device.
I think we are overengineering this situation. Let's look at what actually matters for cpu utilization. These __qdisc_run() things are invoked in two situations where we might block on the hw queue being stopped: 1) When feeding packets into the qdisc in dev_queue_xmit(). Guess what? We _know_ the queue this packet is going to hit. The only new thing we can possible trigger and be interested in at this specific point is if _this_ packet can be sent at this time. And we can check that queue mapping after the qdisc_enqueue_root() call, so that multiq aware qdiscs can have made their changes. 2) When waking up a queue. And here we should schedule the qdisc_run _unconditionally_. If the queue was full, it is extremely likely that new packets are bound for that device queue. There is no real savings to be had by doing this peek/requeue/dequeue stuff. The cpu utilization savings exist for case #1 only, and we can implement the bypass logic _perfectly_ as described above. For #2 there is nothing to check, just do it and see what comes out of the qdisc. I would suggest adding an skb pointer argument to qdisc_run(). If it's NULL, unconditionally schedule __qdisc_run(). Else, only schedule if the TX queue indicated by skb_queue_mapping() is not stopped. dev_queue_xmit() will use the "pass the skb" case, but only if qdisc_enqueue_root()'s return value doesn't indicate that there is a potential drop. On potential drop, we'll pass NULL to make sure we don't potentially reference a free'd SKB. The other case in net_tx_action() can always pass NULL to qdisc_run().