Thread: 47 messages, 6 authors, 2010-09-29

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2010-09-29 08:34:39
Also in: kvm, lkml

On Wed, Sep 29, 2010 at 10:16:45AM +0200, Michael S. Tsirkin wrote:
On Tue, Sep 28, 2010 at 08:24:29PM -0700, Shirley Ma wrote:
Hello Michael,

On Wed, 2010-09-15 at 07:52 -0700, Shirley Ma wrote:
Don't you think that once I address the vhost_add_used_and_signal update
issue, it is a simple and complete patch for macvtap TX zero copy?

Thanks
Shirley
I like the fact that the patch is simple. Unfortunately
I suspect it'll stop being simple by the time it's complete :) 
I can give it a try. :)
I compared several approaches for addressing the issue raised here
about how and when to call vhost_add_used_and_signal. The simplest
approach I have found is:

1. Add a completion field to struct virtqueue;
2. For a zero copy packet, make the vhost thread wait for that
completion before calling vhost_add_used_and_signal;
3. Pass the vq from vhost to macvtap as the skb destructor_arg;
4. When the last reference to the skb is freed, signal the vq completion.
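
The four steps above can be sketched as a small user-space model. This is
only an illustration under stated assumptions, not the patch itself: the
types vq_sim and skb_sim and every function name below are hypothetical
stand-ins, the kernel completion is reduced to a boolean, and the blocking
wait in the vhost thread is reduced to a check that refuses to update the
used ring until the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vhost virtqueue; not the kernel struct. */
struct vq_sim {
    bool tx_done;     /* step 1: the "completion", modelled as a flag */
    int  used_count;  /* buffers handed back to the guest */
};

/* Hypothetical stand-in for an skb: refcount, destructor, destructor arg. */
struct skb_sim {
    int refcount;
    void (*destructor)(struct skb_sim *skb);
    void *destructor_arg;  /* step 3: carries the vq */
};

/* Step 4: runs when the last reference is dropped and signals the vq. */
static void zerocopy_destructor(struct skb_sim *skb)
{
    struct vq_sim *vq = skb->destructor_arg;
    vq->tx_done = true;
}

static void skb_sim_get(struct skb_sim *skb) { skb->refcount++; }

static void skb_sim_put(struct skb_sim *skb)
{
    if (--skb->refcount == 0 && skb->destructor)
        skb->destructor(skb);
}

/* Step 3: attach the vq to the packet before handing it down the stack. */
static void zerocopy_xmit(struct vq_sim *vq, struct skb_sim *skb)
{
    vq->tx_done = false;
    skb->refcount = 1;
    skb->destructor = zerocopy_destructor;
    skb->destructor_arg = vq;
}

/* Step 2: only report the buffer as used once the completion has fired. */
static bool try_add_used(struct vq_sim *vq)
{
    if (!vq->tx_done)
        return false;
    vq->used_count++;
    return true;
}

/* Walks one packet through the model; returns the number of used buffers. */
int completion_demo(void)
{
    struct vq_sim vq = { false, 0 };
    struct skb_sim skb = { 0, NULL, NULL };

    zerocopy_xmit(&vq, &skb);
    skb_sim_get(&skb);           /* lower device takes its own reference */
    skb_sim_put(&skb);           /* stack drops its reference */
    assert(!try_add_used(&vq));  /* guest pages are still referenced */
    skb_sim_put(&skb);           /* last reference: destructor runs */
    assert(try_add_used(&vq));
    return vq.used_count;
}
```

Calling completion_demo() from a main() exercises the whole sequence. The
model assumes the destructor runs only when the last reference is dropped.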
The test results show the same performance as the original patch. What do
you think? If it sounds good to you, I will resubmit the revised patch.
The patch remains as simple as it was before. :)

Thanks
Shirley
If you look at dev_hard_start_xmit you will see a call
to skb_orphan_try which often calls the skb destructor.
So I suspect this is almost equivalent to your original patch,
and has the same correctness issue.
So you could try setting skb_tx(skb)->prevent_sk_orphan = 1
just to see what happens. Might be interesting; just
make sure the device doesn't orphan the skb first thing.
I suspect the lack of parallelism will result in bad throughput,
especially for small messages.
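
To make the concern concrete, here is a simplified user-space model of an
early orphan. It is a sketch under assumptions, not kernel code: pkt_sim
and all function names are hypothetical, prevent_orphan merely stands in
for the prevent_sk_orphan flag mentioned above, and the real transmit path
has more conditions than this. The model shows the destructor firing at
orphan time while the device still reads the packet data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet model; none of these names exist in the kernel. */
struct pkt_sim {
    bool prevent_orphan;  /* stands in for the prevent_sk_orphan flag */
    bool data_in_use;     /* device still reads the guest pages */
    void (*destructor)(struct pkt_sim *pkt);
    bool *done;           /* completion flag owned by the caller */
};

static void pkt_destructor(struct pkt_sim *pkt) { *pkt->done = true; }

/* Orphan attempt on the transmit path: unless prevented, run the
 * destructor right away and clear it, although the data stays in use. */
static void pkt_orphan_try(struct pkt_sim *pkt)
{
    if (pkt->prevent_orphan || !pkt->destructor)
        return;
    pkt->destructor(pkt);
    pkt->destructor = NULL;
}

/* Final free: the data is released, and the destructor runs only if it
 * was not already consumed by an earlier orphan. */
static void pkt_free(struct pkt_sim *pkt)
{
    pkt->data_in_use = false;
    if (pkt->destructor) {
        pkt->destructor(pkt);
        pkt->destructor = NULL;
    }
}

/* Returns true if completion was signalled while data was still in use. */
bool completes_too_early(bool prevent_orphan)
{
    bool done = false;
    struct pkt_sim pkt = { prevent_orphan, true, pkt_destructor, &done };
    bool early;

    pkt_orphan_try(&pkt);
    early = done && pkt.data_in_use;
    pkt_free(&pkt);
    assert(done);  /* the completion always fires eventually */
    return early;
}
```

In this model completes_too_early(false) reports the premature signal and
completes_too_early(true) does not. The flag only helps as long as nothing
further down the path orphans the packet anyway.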

Note this still won't make it correct (there is a module unloading
issue, and devices might still orphan the skb, clone it, or hang on to
paged data in some other way) but it at least gets closer.

I think you should try testing with guest-to-external communication;
this will uncover some of these correctness issues for you.
I think netperf also has some flag to check data, which might
be a good idea to use for testing.
-- 
MST