Thread: 33 messages, 5 authors, 2019-01-11

Re: [RFC PATCH V3 1/5] vhost: generalize adding used elem

From: Jason Wang <jasowang@redhat.com>
Date: 2019-01-07 07:00:29
Also in: kvm, lkml

On 2019/1/5 8:33 AM, Sean Christopherson wrote:
> On Fri, Jan 04, 2019 at 04:29:34PM -0500, Michael S. Tsirkin wrote:
> > On Sat, Dec 29, 2018 at 08:46:52PM +0800, Jason Wang wrote:
> > > Use one generic vhost_copy_to_user() instead of two dedicated
> > > accessors. This will simplify the conversion to fine-grained
> > > accessors. About 2% improvement in PPS was seen during the
> > > virtio-user txonly test.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > I don't have a problem with this patch, but do you have
> > any idea how come removing what's supposed to be
> > an optimization speeds things up?
>
> With SMAP, the 2x vhost_put_user() will also mean an extra STAC/CLAC pair,
> which is probably slower than the overhead of CALL+RET to whatever flavor
> of copy_user_generic() gets used.  CALL+RET is really the only overhead
> since all variants of copy_user_generic() unroll accesses smaller than
> 64 bytes, e.g. on a 64-bit system, __copy_to_user() will write all 8
> bytes in a single MOV.
>
> Removing the special casing also eliminates a few hundred bytes of code
> as well as the need for hardware to predict count==1 vs. count>1.
Yes. I haven't measured it directly, but judging by the PPS results with
nosmap, STAC/CLAC is pretty expensive when we are doing very small copies.

Thanks
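
For illustration only, the STAC/CLAC argument in this thread can be modelled in plain userspace C. This is a rough sketch, not the vhost code: every name below (used_elem_model, model_put_user32(), model_copy_to_user(), add_used_two_puts(), add_used_one_copy()) is hypothetical, and a counter stands in for the real STAC/CLAC instructions. It assumes an 8-byte used element with two 32-bit fields and shows only that two separate put-style stores open two user-access windows where a single copy opens one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte used element: two 32-bit fields (an assumption). */
struct used_elem_model {
	uint32_t id;
	uint32_t len;
};

/* Counts modelled STAC/CLAC pairs; no real SMAP instructions are issued. */
static unsigned int access_windows;
static int window_open;

static void model_stac(void)
{
	assert(!window_open);	/* windows must not nest in this model */
	window_open = 1;
	access_windows++;
}

static void model_clac(void)
{
	assert(window_open);
	window_open = 0;
}

/* Models a put_user()-style store: one window per 32-bit value. */
static void model_put_user32(uint32_t *dst, uint32_t val)
{
	model_stac();
	*dst = val;
	model_clac();
}

/* Models a copy_to_user()-style copy: one window for the whole buffer. */
static void model_copy_to_user(void *dst, const void *src, size_t n)
{
	model_stac();
	memcpy(dst, src, n);
	model_clac();
}

/* Special-cased single-element path: two stores, so two windows. */
static unsigned int add_used_two_puts(struct used_elem_model *ring,
				      const struct used_elem_model *head)
{
	unsigned int before = access_windows;

	model_put_user32(&ring->id, head->id);
	model_put_user32(&ring->len, head->len);
	return access_windows - before;
}

/* Generic path: one copy of the whole element, so one window. */
static unsigned int add_used_one_copy(struct used_elem_model *ring,
				      const struct used_elem_model *head)
{
	unsigned int before = access_windows;

	model_copy_to_user(ring, head, sizeof(*head));
	return access_windows - before;
}
```

Calling both helpers on the same element gives a window count of 2 for the two-store path and 1 for the single-copy path. The model says nothing about actual cycle costs, which would have to be measured on real hardware.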