Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in... | netdev

[PATCH 0/6] vhost code cleanup and minor enhancement · Jason Wang <jasowang@redhat.com> · 2013-08-16
[PATCH 1/6] vhost_net: make vhost_zerocopy_signal_used() returns void · Jason Wang <jasowang@redhat.com> · 2013-08-16
[PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · Jason Wang <jasowang@redhat.com> · 2013-08-16
Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-16
Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · Jason Wang <jasowang@redhat.com> · 2013-08-20
Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · Jason Wang <jasowang@redhat.com> · 2013-08-23
Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-25
[PATCH 3/6] vhost: switch to use vhost_add_used_n() · Jason Wang <jasowang@redhat.com> · 2013-08-16
Re: [PATCH 3/6] vhost: switch to use vhost_add_used_n() · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-16
Re: [PATCH 3/6] vhost: switch to use vhost_add_used_n() · Jason Wang <jasowang@redhat.com> · 2013-08-20
[PATCH 4/6] vhost_net: determine whether or not to use zerocopy at one time · Jason Wang <jasowang@redhat.com> · 2013-08-16
[PATCH 5/6] vhost_net: poll vhost queue after marking DMA is done · Jason Wang <jasowang@redhat.com> · 2013-08-16
Re: [PATCH 5/6] vhost_net: poll vhost queue after marking DMA is done · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-16
Re: [PATCH 5/6] vhost_net: poll vhost queue after marking DMA is done · Jason Wang <jasowang@redhat.com> · 2013-08-20
[PATCH 6/6] vhost_net: remove the max pending check · Jason Wang <jasowang@redhat.com> · 2013-08-16
Re: [PATCH 6/6] vhost_net: remove the max pending check · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-16
Re: [PATCH 6/6] vhost_net: remove the max pending check · Jason Wang <jasowang@redhat.com> · 2013-08-20
Re: [PATCH 6/6] vhost_net: remove the max pending check · Jason Wang <jasowang@redhat.com> · 2013-08-23
Re: [PATCH 6/6] vhost_net: remove the max pending check · "Michael S. Tsirkin" <mst@redhat.com> · 2013-08-25
Re: [PATCH 6/6] vhost_net: remove the max pending check · Jason Wang <jasowang@redhat.com> · 2013-08-26
Re: [PATCH 6/6] vhost_net: remove the max pending check · Jason Wang <jasowang@redhat.com> · 2013-08-30

Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used()

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2013-08-25 11:47:04
Also in: kvm, lkml, virtualization

On Fri, Aug 23, 2013 at 04:50:38PM +0800, Jason Wang wrote:

On 08/20/2013 10:33 AM, Jason Wang wrote:

On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:

On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:

Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
vhost_add_used_and_signal(). With this patch we call it at most twice
(the second call covers done_idx wrapping around), compared to N times
w/o this patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>

So? Does this help performance then?

Looks like it can, especially when the guest supports event index. When
the guest enables tx interrupts, this can save us some unnecessary
signals to the guest. I will do some tests.

I have done some tests. I see a 2% - 3% increase in both aggregate
transaction rate and per-CPU transaction rate in the TCP_RR and UDP_RR
tests.

I'm using ixgbe. W/o this patch, I can see more than 100 calls of
vhost_add_used_and_signal() in one vhost_zerocopy_signal_used(). This is
because ixgbe (and other modern ethernet drivers) tends to free old tx
skbs in a loop during the tx interrupt, and vhost tends to batch adding
used entries and signaling in vhost_zerocopy_callback(). Switching to
vhost_add_used_and_signal_n() saves roughly 100 used idx updates and
memory barriers.

Well it's only smp_wmb so a nop on most architectures, so
a 2% gain is surprising.
I'm guessing the cache miss on the write is what's
giving us a speedup here.

I'll review the code, thanks.


-- 
MST
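
For readers following the thread, here is a minimal userspace sketch of the
batching idea discussed above: count the contiguous completed slots starting
at done_idx, then report them in at most two contiguous runs, the second one
only when the range wraps past the end of the ring. It is not the patch
itself. RING_SIZE, struct tx_ring, flush_batch() and signal_used_batched()
are hypothetical names invented for illustration, and flush_batch() merely
stands in for a batched call such as vhost_add_used_and_signal_n().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical size, chosen small for illustration */

struct tx_ring {
    bool done[RING_SIZE];  /* true once the DMA for that slot has completed */
    size_t done_idx;       /* first slot not yet reported as used */
    size_t upend_idx;      /* one past the last slot handed out for DMA */
    unsigned flush_calls;  /* number of batched report calls made */
    unsigned reported;     /* total number of slots reported */
};

/* Stand-in for a batched "add used and signal" call: reports `count`
 * contiguous slots starting at `start` in a single call. */
static void flush_batch(struct tx_ring *r, size_t start, size_t count)
{
    assert(start + count <= RING_SIZE); /* a batch never crosses the ring end */
    r->flush_calls++;
    r->reported += (unsigned)count;
}

/* Report every completed slot between done_idx and upend_idx, stopping at
 * the first slot still in flight. Returns the number of slots reported. */
static size_t signal_used_batched(struct tx_ring *r)
{
    size_t ready = 0;

    /* Pass 1: count the contiguous run of completed slots. */
    for (size_t i = r->done_idx; i != r->upend_idx; i = (i + 1) % RING_SIZE) {
        if (!r->done[i])
            break;
        r->done[i] = false; /* clear the marker so the slot can be reused */
        ready++;
    }

    /* Pass 2: flush in contiguous chunks. The run is shorter than
     * RING_SIZE, so this loop executes at most twice. */
    size_t left = ready;
    while (left > 0) {
        size_t room = RING_SIZE - r->done_idx;
        size_t n = left < room ? left : room;

        flush_batch(r, r->done_idx, n);
        r->done_idx = (r->done_idx + n) % RING_SIZE;
        left -= n;
    }
    return ready;
}
```

With done_idx at 6, upend_idx at 3 and slots 6, 7, 0 and 1 completed, the
sketch makes two flush_batch() calls (slots 6-7, then slots 0-1) where a
per-slot loop would make four.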
