Re: [RFC PATCH V3 0/5] Hi:
From: Jason Wang <jasowang@redhat.com>
Date: 2019-01-08 10:02:00
On 2019/1/7 10:37 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 02:50:17PM +0800, Jason Wang wrote:
>> On 2019/1/7 12:17 PM, Michael S. Tsirkin wrote:
>>> On Mon, Jan 07, 2019 at 11:53:41AM +0800, Jason Wang wrote:
>>>> On 2019/1/7 11:28 AM, Michael S. Tsirkin wrote:
>>>>> On Mon, Jan 07, 2019 at 10:19:03AM +0800, Jason Wang wrote:
>>>>>> On 2019/1/3 4:47 AM, Michael S. Tsirkin wrote:
>>>>>>> On Sat, Dec 29, 2018 at 08:46:51PM +0800, Jason Wang wrote:
>>>>>>>> This series tries to access virtqueue metadata through kernel virtual addresses instead of copy_user() and friends, since those have too much overhead: checks, speculation barriers, or even hardware feature toggling.
>>>>>>> Will review, thanks! One question that comes to mind is whether it's all about bypassing stac/clac. Could you please include a performance comparison with nosmap?
>>>>>> On machine without SMAP (Sandy Bridge):
>>>>>> Before: 4.8Mpps
>>>>>> After: 5.2Mpps
>>>>> OK so would you say it's really unsafe versus safe accesses? Or would you say it's just better-written code?
>>>> It's the effect of removing the speculation barrier.
>>> You mean __uaccess_begin_nospec introduced by commit 304ec1b050310548db33063e567123fae8fd0301?
>> Yes.
>>> So fundamentally we do access_ok checks when supplying the memory table to the kernel thread, and we should do the spec barrier there. Then we can just create and use a variant of uaccess macros that does not include the barrier?
>> The unsafe ones?
> Fundamentally yes.
>>> Or, how about moving the barrier into access_ok? This way repeated accesses with a single access_ok get a bit faster. CC Dan Williams on this idea.
>> The problem is, e.g. for the vhost control path: during mem table validation, we don't even want to access them there. So the spec barrier is not needed.
> Again, the spec barrier is not needed as such at all. It's defence in depth. And mem table init is a slow path. So we can stick a barrier there and it won't be a problem for anyone.
Consider that it's a generic helper. For defence in depth, we should keep the barrier around the places where we do the real userspace memory access.
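To make the trade-off in this sub-thread concrete, here is a toy userspace model, not kernel code. It only counts how often a simulated barrier runs when the range check and barrier are paid per access versus once per batch. Every name in it (`struct region`, `model_barrier`, `model_get_checked`, `model_get_batch`) is hypothetical and does not correspond to any vhost or uaccess function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one validated memory-table entry. */
struct region {
    uint8_t *base;
    size_t len;
};

/* Counts simulated speculation barriers; a real barrier is not modelled. */
unsigned long model_barrier_count;

void model_barrier(void)
{
    model_barrier_count++;
}

/* Overflow-safe range check, playing the role of an access_ok-style test. */
int model_range_ok(const struct region *r, size_t off, size_t n)
{
    assert(r != NULL);
    return off <= r->len && n <= r->len - off;
}

/* Variant A: range check and barrier on every single 16-bit read. */
int model_get_checked(const struct region *r, size_t off, uint16_t *out)
{
    if (!model_range_ok(r, off, sizeof(*out)))
        return -1;
    model_barrier();
    memcpy(out, r->base + off, sizeof(*out));
    return 0;
}

/* Variant B: one range check and one barrier, then n reads in one copy. */
int model_get_batch(const struct region *r, size_t off, uint16_t *out, size_t n)
{
    if (n > SIZE_MAX / sizeof(*out))
        return -1;
    if (!model_range_ok(r, off, n * sizeof(*out)))
        return -1;
    model_barrier();
    memcpy(out, r->base + off, n * sizeof(*out));
    return 0;
}
```

Reading four entries through `model_get_checked` runs the simulated barrier four times, while one `model_get_batch` call for the same four entries runs it once. The model says nothing about where the barrier is actually needed for safety, which is the point under debate above.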
>>>>>> On machine with SMAP (Broadwell):
>>>>>> Before: 5.0Mpps
>>>>>> After: 6.1Mpps
>>>>>> No smap: 7.5Mpps
>>>>>> Thanks
>>>>> no smap being before or after?
>>>> Let me clarify:
>>>> Before (SMAP on): 5.0Mpps
>>>> Before (SMAP off): 7.5Mpps
>>>> After (SMAP on): 6.1Mpps
>>>> Thanks
>>> How about after + smap off?
>> After (SMAP off): 8.0Mpps
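For readers comparing the figures, the relative changes can be worked out with a short helper. This is only arithmetic on the Mpps numbers quoted in this thread; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput change in percent, e.g. 5.0 -> 6.1 Mpps is +22%. */
double speedup_pct(double before_mpps, double after_mpps)
{
    assert(before_mpps > 0.0);
    return (after_mpps - before_mpps) / before_mpps * 100.0;
}

/* Print the comparisons using only the figures quoted in this thread. */
void print_thread_numbers(void)
{
    printf("Sandy Bridge, no SMAP, series applied: %+.1f%%\n",
           speedup_pct(4.8, 5.2));
    printf("Broadwell, SMAP on, series applied:    %+.1f%%\n",
           speedup_pct(5.0, 6.1));
    printf("Broadwell, SMAP off, series applied:   %+.1f%%\n",
           speedup_pct(7.5, 8.0));
    printf("Broadwell, SMAP on vs off, before:     %+.1f%%\n",
           speedup_pct(7.5, 5.0));
    printf("Broadwell, SMAP on vs off, after:      %+.1f%%\n",
           speedup_pct(8.0, 6.1));
}
```

On Broadwell the series gains about 22% with SMAP on but under 7% with SMAP off, and turning SMAP on still costs roughly a quarter of the throughput after the series. That gap is what the rest of the discussion is about.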
>>> And maybe we want a module option just for the vhost thread to keep smap off generally, since almost all it does is copy stuff from userspace into the kernel anyway. Because what the above numbers show is that we really, really want a solution that isn't limited to just meta-data access, and I really do not see how any such solution cannot also be used to make meta-data access fast.
>> As we've discussed in another thread on the previous version, this requires lots of changes; the main issue is that SMAP state is not saved/restored on explicit schedule().
> I wonder how expensive reading eflags can be? If it's cheap we can just check EFLAGS.AC and rerun stac if needed.
Probably not expensive, but considering vhost is probably the only user, is it really worth doing this? If we do vmap + batched copy, most of the code is still under the protection of SMAP, but the performance is almost the same. Isn't this a much better solution?
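The EFLAGS.AC check raised above can be illustrated from userspace. This is a sketch only: it shows how to test the AC bit (bit 18) in a flags word, and on x86-64 with GCC or Clang it reads the flags through the compiler's `__readeflags()` intrinsic. In userspace AC controls alignment checking, so this does not reproduce the kernel's stac/clac handling, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>
#endif

/* EFLAGS.AC is bit 18; stac sets it and clac clears it. */
#define EFLAGS_AC (UINT64_C(1) << 18)

/* Return 1 if the AC bit is set in the given flags word, else 0. */
int flags_ac_is_set(uint64_t flags)
{
    return (flags & EFLAGS_AC) != 0;
}

#if defined(__x86_64__)
/* Read the current flags register via the compiler intrinsic. */
uint64_t read_flags(void)
{
    return (uint64_t)__readeflags();
}
#endif

/* Usage example: self-check on known flag words. */
void flags_ac_selfcheck(void)
{
    assert(flags_ac_is_set(EFLAGS_AC));
    assert(!flags_ac_is_set(UINT64_C(0x202))); /* IF set, AC clear */
}
```

A kernel-side version would test the same bit after a possible reschedule and reissue stac only when it is clear. Whether that is cheap enough, and whether it is worth doing for a single user, is the open question in this exchange.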
>> Even if it did, since vhost will call lots of net/block code, any kind of uaccess in that code needs to understand this special request from vhost, e.g. you probably need to invent a new kind of iov iterator that does not touch SMAP at all. And I'm not sure this is the only thing we need to deal with.
> Well we wanted to move packet processing from tun into vhost anyway, right?
Yes, but how about other devices? And we should deal with the zerocopy path. It is not a small amount of refactoring and work.
>> So I still prefer to:
>> 1) speed up the metadata access through vmap + MMU notifier
>> 2) speed up the data copy with batched copy (unsafe ones or other new interfaces)
>> Thanks
> I just guess once you do (2) you will want to rework (1) to use the new interfaces.
Do you mean batching? Batched copy is much easier: just a few lines of code if the unsafe variants are ready, or we can invent new safe variants. But it would still be slower than vmap. And what's more, vmap does not conflict with batching.
> So all the effort you are now investing in (1) will be wasted. Just my $.02.
Speeding up metadata access is much easier, and vmap was the fastest method, so we can benefit from it soon. Speeding up data copy requires much more work. And in the future, if the kernel or vhost is ready for some new API and perf numbers prove its advantage, it does no harm to switch.

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization