Re: [RFC PATCH net-next] net: pktgen: packet bursting via skb->xmit_more
From: Jesper Dangaard Brouer <hidden>
Date: 2014-09-26 08:05:50
On Thu, 25 Sep 2014 17:46:22 -0700 Alexei Starovoitov [off-list ref] wrote:
This patch demonstrates the effect of delaying update of HW tailptr.
(based on earlier patch by Jesper)
burst=1 is a default. It sends one packet with xmit_more=false
burst=2 sends one packet with xmit_more=true and
2nd copy of the same packet with xmit_more=false
burst=3 sends two copies of the same packet with xmit_more=true and
3rd copy with xmit_more=false
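
The burst semantics above can be sketched in user-space C. This is only an illustration, not driver code: `struct fake_ring`, `fake_xmit()` and `send_burst()` are hypothetical names, and the "tail pointer" is just a counter standing in for the HW tailptr update that xmit_more lets a driver defer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NIC TX ring: counts queued descriptors and
 * tail-pointer (doorbell) writes. Not a real driver structure. */
struct fake_ring {
    unsigned int queued;       /* descriptors placed on the ring */
    unsigned int tail_writes;  /* times the HW tail pointer was updated */
};

/* Models a driver xmit routine: always queue the packet, but only touch
 * the tail pointer when no further packet has been announced. */
static void fake_xmit(struct fake_ring *ring, bool xmit_more)
{
    ring->queued++;
    if (!xmit_more)
        ring->tail_writes++;
}

/* Sends `burst` copies as the commit message describes:
 * xmit_more=true for every copy except the last one. */
static void send_burst(struct fake_ring *ring, int burst)
{
    int burst_cnt = 0;
    bool more;

    assert(burst >= 1);
    do {
        more = ++burst_cnt < burst;
        fake_xmit(ring, more);
    } while (more);
}
```

With burst=3 the sketch queues three descriptors but updates the tail pointer once; with burst=1 it behaves like the unpatched path.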
Performance with ixgbe:
usec 30:
burst=1 tx:9.2 Mpps
burst=2 tx:13.6 Mpps
burst=3 tx:14.5 Mpps full 10G line rate
Perfect, full wirespeed! :-)
usec 1 (default):
burst=1,4,100 tx:3.9 Mpps
Here you are limited by the TX ring queue cleanup being too slow, as described here: http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
usec 0:
burst=1 tx:4.9 Mpps
burst=2 tx:6.6 Mpps
burst=3 tx:7.9 Mpps
burst=4 tx:8.7 Mpps
burst=8 tx:10.3 Mpps
burst=128 tx:12.4 Mpps

Cc: Jesper Dangaard Brouer <redacted>
Signed-off-by: Alexei Starovoitov <redacted>
---
Acked-by: Jesper Dangaard Brouer <redacted>
tx queue size, irq affinity left in default. pause frames are off. Nice to finally see line rate generated by one cpu
Yes,
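
For reference, the theoretical packet rate of a 10G link can be worked out from the Ethernet wire overhead. The sketch below assumes minimum-size 64-byte frames (the thread does not state the packet size used in the test), and `line_rate_pps()` is a hypothetical helper name.

```c
#include <assert.h>

/* Packets per second on an Ethernet link, counting the per-frame wire
 * overhead: 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap. */
static double line_rate_pps(double link_bps, unsigned int frame_bytes)
{
    const unsigned int overhead_bytes = 7 + 1 + 12;

    assert(frame_bytes >= 64);   /* minimum Ethernet frame, FCS included */
    return link_bps / ((frame_bytes + overhead_bytes) * 8.0);
}
```

Under that assumption, `line_rate_pps(10e9, 64)` gives roughly 14.88 Mpps, so the measured 14.5 Mpps is close to the theoretical maximum.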
Compared to Jesper's patch, this one amortizes the cost of spin_lock and atomic_inc by doing HARD_TX_LOCK and atomic_add(N) once across N packets.
Nice additional optimizations :-)
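
A rough user-space model of that amortization, assuming the cost that matters is the number of lock round-trips and atomic read-modify-write operations. `struct cost` and both functions are hypothetical; the lock is modelled by a plain counter rather than a real spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counters standing in for HARD_TX_LOCK and the skb->users
 * refcount; they only record how many operations were issued. */
struct cost {
    unsigned int lock_ops;    /* lock/unlock pairs taken */
    unsigned int atomic_ops;  /* atomic read-modify-write operations */
    atomic_int users;         /* models skb->users */
};

/* Per-packet scheme: one lock pair and one atomic increment per packet. */
static void send_per_packet(struct cost *c, int n)
{
    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        c->lock_ops++;                  /* lock ... unlock for this packet */
        atomic_fetch_add(&c->users, 1);
        c->atomic_ops++;
    }
}

/* Burst scheme: one lock pair and one atomic add of n for all packets. */
static void send_burst_amortized(struct cost *c, int n)
{
    assert(n >= 1);
    c->lock_ops++;                      /* lock held across the whole burst */
    atomic_fetch_add(&c->users, n);
    c->atomic_ops++;
}
```

Both variants leave the reference count at the same value; only the number of synchronizing operations differs (N versus 1).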
 net/core/pktgen.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 5c728aa..47557ba 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -387,6 +387,7 @@ struct pktgen_dev {
 	u16 queue_map_min;
 	u16 queue_map_max;
 	__u32 skb_priority;	/* skb priority field */
+	int burst;		/* number of duplicated packets to burst */
 	int node;		/* Memory node */
 #ifdef CONFIG_XFRM
[...]
@@ -3299,7 +3313,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 {
 	struct net_device *odev = pkt_dev->odev;
 	struct netdev_queue *txq;
-	int ret;
+	int burst_cnt, ret;
+	bool more;
 	/* If device is offline, then don't send */
 	if (unlikely(!netif_running(odev) || !netif_carrier_ok(odev))) {
@@ -3347,8 +3362,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->last_ok = 0;
 		goto unlock;
 	}
-	atomic_inc(&(pkt_dev->skb->users));
-	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, false);
+	atomic_add(pkt_dev->burst, &pkt_dev->skb->users);
+
+	burst_cnt = 0;
+
+xmit_more:
+	more = ++burst_cnt < pkt_dev->burst;
+
+	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, more);
 	switch (ret) {
 	case NETDEV_TX_OK:
@@ -3356,6 +3377,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->sofar++;
 		pkt_dev->seq_num++;
 		pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
+		if (more)
+			goto xmit_more;
I think this will break my VLAN hack mode, which allows me to shoot pktgen after the qdisc layer. I'm okay with that, as I can simply avoid using this new burst mode, and then it will still work for me.
 		break;
 	case NET_XMIT_DROP:
 	case NET_XMIT_CN:
@@ -3374,6 +3397,9 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;
 	}
+
+	if (unlikely(pkt_dev->burst - burst_cnt > 0))
+		atomic_sub(pkt_dev->burst - burst_cnt, &pkt_dev->skb->users);
 unlock:
 	HARD_TX_UNLOCK(odev, txq);
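
The reference accounting in this last hunk can be checked with a small user-space model. It is a sketch under simplifying assumptions: the hypothetical device accepts only the first `accepted` copies, every accepted copy is freed immediately (dropping one reference), and a refused copy has its reference dropped by the caller, mirroring the atomic_dec in the patch. `burst_refcount_after()` is not a kernel function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Returns the final value of a users-style refcount after attempting a
 * burst in which the hypothetical device accepts only `accepted` copies. */
static int burst_refcount_after(int burst, int accepted)
{
    atomic_int users = 1;   /* the generator's own reference */
    int burst_cnt = 0;
    bool more;

    assert(burst >= 1 && accepted >= 0);

    atomic_fetch_add(&users, burst);     /* reserve one reference per copy */

    for (;;) {
        more = ++burst_cnt < burst;
        if (burst_cnt > accepted) {
            /* Device refused this copy: drop its reference and stop. */
            atomic_fetch_sub(&users, 1);
            break;
        }
        atomic_fetch_sub(&users, 1);     /* device consumed and freed it */
        if (!more)
            break;
    }

    /* Return the references reserved for copies never attempted. */
    if (burst - burst_cnt > 0)
        atomic_fetch_sub(&users, burst - burst_cnt);

    return atomic_load(&users);
}
```

In this model the count always returns to 1, whether the burst completes or stops early, which is the invariant the final atomic_sub is there to preserve.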
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer