--- v6
+++ v5
@@ -10,33 +10,32 @@
This patch series uses the SKBTX_DEV_ZEROCOPY flag to tell the stack it needs to
know when the skb is freed up. That is the way KVM solved the same problem,
and based on my initial tests it can do the same for us. Avoiding the extra
-copy boosted TX throughput from 6.8 Gbps to 7.9 Gbps (I used a slower AMD
+copy boosted TX throughput from 6.8 Gbps to 7.9 Gbps (I used a slower
Interlagos box, both Dom0 and guest on upstream kernel, on the same NUMA node,
running iperf 2.0.5, and the remote end was a bare metal box on the same 10Gb
switch)
Based on my investigations the packet only gets copied if it is delivered to
-Dom0 IP stack through deliver_skb, which is due to this [2] patch. This affects
-DomU->Dom0 IP traffic and when Dom0 does routing/NAT for the guest. That's a bit
-unfortunate, but luckily it doesn't cause a major regression for this use case.
-In the future we should try to eliminate that copy somehow.
+Dom0 stack, which is due to this [2] patch. That's a bit unfortunate, but
+luckily it doesn't cause a major regression for this use case. In the future
+we should try to eliminate that copy somehow.
There are a few spinoff tasks which will be addressed in separate patches:
- grant copy the header directly instead of map and memcpy. This should help
us avoid TLB flushing
- use something other than ballooned pages
- fix grant map to use page->index properly
+I will run some more extensive tests, but some basic XenRT tests were already
+passed with good results.
I've tried to break it down into smaller patches, with mixed results, so I
welcome suggestions on that part as well:
-1: Use skb->cb to store pending_idx
-2: Some refactoring
-3: Change RX path for mapped SKB fragments (moved here to keep bisectability,
-review it after #4)
-4: Introduce TX grant mapping
-5: Remove old TX grant copy definitions and fix indentations
-6: Add stat counters for zerocopy
-7: Handle guests with too many frags
-8: Add stat counters for frag_list skbs
-9: Timeout packets in RX path
-10: Aggregate TX unmap operations
+1: Introduce TX grant map definitions
+2: Change TX path from grant copy to mapping
+3: Remove old TX grant copy definitions and fix indentations
+4: Change RX path for mapped SKB fragments
+5: Add stat counters for zerocopy
+6: Handle guests with too many frags
+7: Add stat counters for frag_list skbs
+8: Timeout packets in RX path
+9: Aggregate TX unmap operations
v2: I've fixed some smaller things, see the individual patches. I've added a
few new stat counters, and handling the important use case when an older guest
@@ -53,10 +52,6 @@
v5: Only minor fixes based on Wei's comments
-v6: Important bugfixes for xenvif_poll exit path and zerocopy callback, see
-first 2 patches. Also rework of handling packets with too many slots, and
-reorder the series a bit.
-
[1] http://lwn.net/Articles/491522/
[2] https://lkml.org/lkml/2012/7/20/363