Re: [PATCH] eal: fix rte_memcpy perf in hsw/bdw
From: Thomas Monjalon <hidden>
Date: 2016-06-15 14:21:11
2016-05-24 21:23, Zhihong Wang:
This patch fixes rte_memcpy performance on Haswell and Broadwell for vhost when the copy size is larger than 256 bytes.

It is observed that for large copies, such as 1024- or 1518-byte ones, rte_memcpy suffers from a high ratio of store-buffer-full events, which causes the pipeline to stall in scenarios like vhost enqueue. This can be alleviated by adjusting the instruction layout. Note that this issue may not be visible in micro tests.

How to reproduce?

PHY-VM-PHY using vhost/virtio, or vhost/virtio loopback, with large packets such as 1024- or 1518-byte ones. Make sure the packet generation rate is not the bottleneck if PHY-VM-PHY is used.

Signed-off-by: Zhihong Wang <redacted>
Test report: http://dpdk.org/ml/archives/dev/2016-May/039716.html

Applied, thanks