On 19/12/11 10:19, Amos Kong wrote:
On 12/12/11 13:12, Rusty Russell wrote:
On Mon, 12 Dec 2011 11:06:53 +0800, Amos Kong wrote:
On 12/12/11 06:27, Benjamin Herrenschmidt wrote:
On Sun, 2011-12-11 at 14:25 +0200, Michael S. Tsirkin wrote:
Forwarding some results by Amos, who ran multiple netperf streams in
parallel, from an external box to the guest. TCP_STREAM results were
noisy. This could be due to buffering done by TCP, where packet size
varies even as message size is constant.
TCP_RR results were consistent. In this benchmark, after switching
to mandatory barriers, CPU utilization increased by up to 35% while
throughput went down by up to 14%. The normalized throughput/cpu
regressed consistently, by between 7% and 35%.
The "fix" applied was simply this:
What machine & processor was this?
Pinned guest memory to NUMA node 1.
Please try this patch. How much does the branch cost us?
(Compiles, untested).
Thanks,
Rusty.
From: Rusty Russell<redacted>
Subject: virtio: harsher barriers for virtio-mmio.
We were cheating with our barriers; using the smp ones rather than the
real device ones. That was fine, until virtio-mmio came along, which
could be talking to a real device (a non-SMP CPU).
Unfortunately, just putting back the real barriers (reverting
d57ed95d) causes a performance regression on virtio-pci. In
particular, Amos reports that with netperf's TCP_RR over virtio_net, CPU
utilization increased by up to 35% while throughput went down by up to
14%.
By comparison, this branch costs us???
Reference: https://lkml.org/lkml/2011/12/11/22
Signed-off-by: Rusty Russell<redacted>
---
drivers/lguest/lguest_device.c | 10 ++++++----
drivers/s390/kvm/kvm_virtio.c | 2 +-
drivers/virtio/virtio_mmio.c | 7 ++++---
drivers/virtio/virtio_pci.c | 4 ++--
drivers/virtio/virtio_ring.c | 34 +++++++++++++++++++++-------------
include/linux/virtio_ring.h | 1 +
tools/virtio/linux/virtio.h | 1 +
tools/virtio/virtio_test.c | 3 ++-
8 files changed, 38 insertions(+), 24 deletions(-)
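
For readers without the diff at hand, here is a minimal user-space sketch of
the kind of branch the commit message asks about. It assumes the barrier
strength is selected per ring by a boolean flag; the names ring_sketch,
weak_barriers and ring_mb are illustrative and not taken from the patch, and
the two counters exist only to make the chosen path observable. User space has
no portable equivalent of the kernel's mb(), so the same C11 fence stands in
for both paths.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified ring state: only what this sketch needs. */
struct ring_sketch {
    bool weak_barriers;         /* assumed flag: true when the peer is another CPU of the same SMP system */
    unsigned long weak_count;   /* instrumentation for this sketch only */
    unsigned long strong_count; /* instrumentation for this sketch only */
};

/* Full barrier chosen per ring; the "if" is the branch whose cost is in question. */
static inline void ring_mb(struct ring_sketch *r)
{
    if (r->weak_barriers) {
        /* Kernel analogue would be smp_mb(). */
        atomic_thread_fence(memory_order_seq_cst);
        r->weak_count++;
    } else {
        /* Kernel analogue would be mb(); the same fence stands in here. */
        atomic_thread_fence(memory_order_seq_cst);
        r->strong_count++;
    }
}
```

Under this assumption a transport such as virtio-pci would set the flag and
keep the cheaper SMP barriers, while virtio-mmio would clear it and pay for
the real device barriers. Every ring operation then pays for one extra
conditional branch, which is the cost the measurements are meant to quantify.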
Hi all,
I tested with the same environment and scenarios.
I ran each scenario three times and computed the average for better
precision.
Thanks, Amos
--------- compare results -----------
Mon Dec 19 09:51:09 2011
1 - avg-old.netperf.exhost_guest.txt
2 - avg-fixed.netperf.exhost_guest.txt
======
TCP_STREAM
  sessions| size|throughput|   cpu| normalize| #tx-pkts| #rx-pkts| #tx-byts| #rx-byts| #re-trans| #tx-intr| #rx-intr| #io_exit| #irq_inj|#tpkt/#exit| #rpkt/#irq
1        1|   64|   1073.54| 10.50|       102|        0|       31|        0|     1612|         0|       16|   487641|   489753|   504764|       0.00|       0.00
2        1|   64|   1079.44| 10.29|       104|        0|       30|        0|     1594|         0|       17|   487156|   488828|   504411|       0.00|       0.00
%         |  0.0|      +0.5|  -2.0|      +2.0|        0|     -3.2|        0|     -1.1|         0|     +6.2|     -0.1|     -0.2|     -0.1|
The formatting is broken on the web page, so I have attached the result file.
It is also available here: http://amosk.info/download/rusty-fix-perf.txt
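
For anyone re-deriving the "%" row from the two result files, a small sketch
of the arithmetic follows. It assumes the "normalize" column is throughput
divided by CPU utilization with the fraction dropped, and that the "%" row is
the relative change from file 1 to file 2; both assumptions match the
TCP_STREAM numbers above but are not stated in the result file. The helper
names are made up for illustration.

```c
#include <stdio.h>

/* Throughput per unit of CPU, as the "normalize" column appears to report it. */
static double normalized(double throughput, double cpu)
{
    return cpu > 0.0 ? throughput / cpu : 0.0;
}

/* Relative change from old to new, in percent; 0 when the baseline is 0. */
static double pct_change(double old_val, double new_val)
{
    return old_val != 0.0 ? (new_val - old_val) / old_val * 100.0 : 0.0;
}

/* Print one comparison line for a named metric. */
static void report(const char *name, double old_val, double new_val)
{
    printf("%-12s %10.2f -> %10.2f  (%+.1f%%)\n",
           name, old_val, new_val, pct_change(old_val, new_val));
}
```

For example, report("throughput", 1073.54, 1079.44) and
report("cpu", 10.50, 10.29) cover the throughput and CPU columns of the row
above.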