Thread (141 messages) 141 messages, 7 authors, 2016-10-26

Re: [PATCH] optimize vhost enqueue

From: Wang, Zhihong <hidden>
Date: 2016-08-17 10:07:05

-----Original Message-----
From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
Sent: Wednesday, August 17, 2016 5:18 PM
To: Wang, Zhihong <redacted>; Yuanhan Liu
[off-list ref]
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue



On 08/17/2016 08:41 AM, Wang, Zhihong wrote:
quoted
quoted
-----Original Message-----
From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
Sent: Wednesday, August 17, 2016 10:38 AM
To: Wang, Zhihong <redacted>
Cc: Maxime Coquelin <redacted>; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue

On Wed, Aug 17, 2016 at 01:45:26AM +0000, Wang, Zhihong wrote:
quoted
quoted
-----Original Message-----
From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
Sent: Tuesday, August 16, 2016 10:00 PM
To: Wang, Zhihong <redacted>; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] optimize vhost enqueue

Hi Zhihong,

On 08/16/2016 05:50 AM, Zhihong Wang wrote:
quoted
This patch optimizes the vhost enqueue function:
rte_vhost_enqueue_burst.
quoted
quoted
quoted
Currently there're 2 callbacks for vhost enqueue:
 *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.
 *  virtio_dev_rx for mrg_rxbuf turned off cases.

The virtio_dev_merge_rx doesn't provide optimal performance, also it is
reported having compatibility issue working with Windows VMs.
Could you tell us more please about this compatibility issue?

For example, when you have testpmd in the host and Window VM as the
guest,
quoted
with mrg_rxbuf turned on, the guest will hang once there's packets
enqueued
quoted
quoted
quoted
by virtio_dev_merge_rx.
You should put it into commit log.

Okay.

quoted
quoted
Let me know if you see the same issue.

quoted
quoted
Besides, having 2 separated functions increases maintenance efforts.

This patch uses a single function logic to replace the current 2 for
better maintainability, and provides better performance by optimizing
caching behavior especially for mrg_rxbuf turned on cases.
Here, here sounds two parts to me:

- one to unite mergeable and non-mergeable Rx

- another one to optimize the mergeable path

That means you should do it in two patches, with that we can have clear
understanding what changes the performance boost. It also helps review.

Please see explanation below.

quoted
quoted
quoted
Do you have some benchmark comparison before and after your change?

Also, for maintainability, I would suggest the that the enqueue
function be split. Because vhost_enqueue_burst becomes very long (220
LoC), and max level of indentation is too high (6).

It makes the code hard to understand, and prone to miss bugs during
review and maintenance.
Agreed.
quoted
This is something I've thought about while writing the code, the reason I
keep it as one function body is that:

 1. This function is very performance sensitive, and we need full control of
    code ordering (You can compare with the current performance with the
    mrg_rxbuf feature turned on to see the difference).
Will inline functions help?

Optimization in this patch actually reorganizes the code from its logic,
so it's not suitable for making separated functions.

I'll explain this in v2.
I agree with Yuanhan.
Inline functions should not break the optimizations.
IMHO, this is mandatory for the patch to be accepted.

Excellent!

quoted
quoted
quoted
 2. I somehow find that a single function logic makes it easier to understand,
    surely I can add comments to make it easiler to read for .

Please let me know if you still insist, we can discuss more on it.
I am personally not a fan of huge function; I would try hard to avoid
too many levels of indentation as well.
quoted
quoted
quoted
It also fixes the issue working with Windows VMs.
Ideally, the fix should be sent separately, before the rework.
Indeed, we might want to have the fix in the stable branch, without
picking the optimization.
Agreed.

The fact is that I don't have much time to debug with the current code
since it's messy and I don't have Windows virtio code and the debugging
environment.
It seems you are not the only one facing the issue:
https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/70

So a dedicated fix is really important.

Yeah that's me raising this issue there.

But I think it's another standalone task to identify the root cause and
find the fix for the existing code.

quoted
This patch doesn't try to fix this issue, it rewrites the logic totally,
and somehow fixes this issue.

Do you think integrating this whole patch into the stable branch will work?
Personally I think it makes more sense.
No.
We don't even know why/how it fixes the Windows issue, which would be
the first thing to understand before integrating a fix in stable branch.

And the stable branch is not meant for integrating such big reworks,
it is only meant to fix bugs.

The risk of regressions have to be avoided as much as possible.
quoted
quoted
quoted
quoted
quoted
Signed-off-by: Zhihong Wang <redacted>
---
 lib/librte_vhost/vhost-net.h  |   6 +-
 lib/librte_vhost/vhost_rxtx.c | 582
++++++++++++++----------------------------
quoted
quoted
quoted
 lib/librte_vhost/virtio-net.c |  15 +-
 3 files changed, 208 insertions(+), 395 deletions(-)
582 lines changed is a huge patch.
If possible, it would be better splitting it in incremental changes,
making the review process easier.

It looks like a huge patch, but it simply deletes the current implementation
and add the new code. I think perhaps split it into 2, 1st one to replace
just the rte_vhost_enqueue_burst, 2nd one to delete all the obsolete
functions.
quoted
It should make the patch clear, how do you think?  :)
Nope, it's not working in that way. It should be:

- one patch to fix the hang issue for windows guest

  Please cc it to stable@dpdk.org as well so that we could pick it for
  v16.07 stable release.

- one patch to unite the two different Rx code path

- another patch to optimize mergeable code path

I can separate optimization from the basic code in v2, however as I explained
this patch is built from scratch and doesn't take anything from the existing
code, so there's no way to transform from the existing code incrementally into
the new code.

quoted
	--yliu
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help