Re: [net-next PATCH 1/1 V4] qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE
From: Tom Herbert <hidden>
Date: 2014-09-25 15:05:40
On Thu, Sep 25, 2014 at 7:57 AM, Jesper Dangaard Brouer [off-list ref] wrote:
On Thu, 25 Sep 2014 07:40:33 -0700 Tom Herbert [off-list ref] wrote:
A few test results in patch 0 are good. I like to have results with and without the patch. These should show two things: 1) any regressions caused by the patch, 2) performance gains (in that order of importance :-) ). There doesn't need to be a lot here, just something reasonably representative, simple, and easily reproducible. My expectation for bulk dequeue is that we should see no obvious regression and hopefully an improvement in CPU utilization -- are you able to verify this?

We are saving 3% CPU, as I described in my post with subject: "qdisc/UDP_STREAM: measuring effect of qdisc bulk dequeue":
http://thread.gmane.org/gmane.linux.network/331152/focus=331154

Using UDP_STREAM on the 1Gbit/s igb driver, I can show that the _raw_spin_lock calls are reduced by approx 3% when enabling bulking of just 2 packets.
That's great. In the commit log, it would be good to have results with TCP_STREAM as well, and please report aggregate CPU utilization changes (like from mpstat).

Thanks,
Tom
This test can only demonstrate a CPU usage reduction, as the
throughput is already at maximum link (bandwidth) capacity.
Note the netperf option "-m 1472", which makes sure we are not sending
fragmented UDP/IP packets::
netperf -H 192.168.111.2 -t UDP_STREAM -l 120 -- -m 1472
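For readers wondering where 1472 comes from: it is the largest UDP payload that fits in a single IPv4 packet on a link with a 1500-byte MTU. The sketch below works through that arithmetic and the resulting packet rate at line speed. The 1500-byte MTU, IPv4 without options, and the function names are assumptions for illustration; they are not stated in the thread.

```python
# Assumptions (not from the thread): standard 1500-byte Ethernet MTU,
# IPv4 header without options. Function names are illustrative only.
IPV4_HEADER = 20       # bytes, IPv4 header without options
UDP_HEADER = 8         # bytes
ETH_HEADER = 14        # bytes, dst MAC + src MAC + EtherType
ETH_FCS = 4            # bytes, frame check sequence
ETH_PREAMBLE = 8       # bytes, preamble + start-of-frame delimiter
ETH_IFG = 12           # bytes, minimum inter-frame gap


def max_udp_payload(mtu: int = 1500) -> int:
    """Largest UDP payload that avoids IP fragmentation for a given MTU."""
    return mtu - IPV4_HEADER - UDP_HEADER


def line_rate_pps(payload: int, link_bps: float = 1e9) -> float:
    """Packets per second at full link rate for a given UDP payload size."""
    wire_bytes = (payload + UDP_HEADER + IPV4_HEADER
                  + ETH_HEADER + ETH_FCS + ETH_PREAMBLE + ETH_IFG)
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    payload = max_udp_payload()
    print(payload)                      # 1472
    print(int(line_rate_pps(payload)))  # 81274
```

Under these assumptions a saturated 1Gbit/s link carries roughly 81k such packets per second, which is why the test can show a CPU saving but no throughput gain.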
Results from perf diff::
# Command: perf diff
# Event 'cycles'
# Baseline    Delta  Symbol
# no-bulk   bulk(1)
# ........  .......  .........................................
#
     7.05%   -3.03%  [k] _raw_spin_lock
     6.34%   +0.23%  [k] copy_user_enhanced_fast_string
     6.30%   +0.26%  [k] fib_table_lookup
     3.03%   +0.01%  [k] __slab_free
     3.00%   +0.08%  [k] intel_idle
     2.49%   +0.05%  [k] sock_alloc_send_pskb
     2.31%   +0.30%  netperf [.] send_omni_inner
     2.12%   +0.12%  netperf [.] send_data
     2.11%   +0.10%  [k] udp_sendmsg
     1.96%   +0.02%  [k] __ip_append_data
     1.48%   -0.01%  [k] __alloc_skb
     1.46%   +0.07%  [k] __mkroute_output
     1.34%   +0.05%  [k] __ip_select_ident
     1.29%   +0.03%  [k] check_leaf
     1.27%   +0.09%  [k] __skb_get_hash
One nitpick: this testing was done on V2 of the patchset.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer