__get_user slower than get_user (was Re: [RFC PATCH V3 0/5] Hi:)
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2019-01-09 04:31:19
Also in: kvm, lkml, virtualization
On Mon, Jan 07, 2019 at 02:44:24PM -0800, Dan Williams wrote:
> On Mon, Jan 7, 2019 at 2:25 PM Michael S. Tsirkin [off-list ref] wrote:
>> On Mon, Jan 07, 2019 at 01:39:15PM -0800, Dan Williams wrote:
>>> On Mon, Jan 7, 2019 at 6:11 AM Michael S. Tsirkin [off-list ref] wrote:
>>>> On Sun, Jan 06, 2019 at 11:15:20PM -0800, Dan Williams wrote:
>>>>> On Sun, Jan 6, 2019 at 8:17 PM Michael S. Tsirkin [off-list ref] wrote:
>>>>>> On Mon, Jan 07, 2019 at 11:53:41AM +0800, Jason Wang wrote:
>>>>>>> On 2019/1/7 11:28 AM, Michael S. Tsirkin wrote:
>>>>>>>> On Mon, Jan 07, 2019 at 10:19:03AM +0800, Jason Wang wrote:
>>>>>>>>> On 2019/1/3 4:47 AM, Michael S. Tsirkin wrote:
>>>>>>>>>> On Sat, Dec 29, 2018 at 08:46:51PM +0800, Jason Wang wrote:
>>>>>>>>>>> This series tries to access virtqueue metadata through kernel
>>>>>>>>>>> virtual address instead of copy_user() friends since they had
>>>>>>>>>>> too much overheads like checks, spec barriers or even hardware
>>>>>>>>>>> feature toggling.
>>>>>>>>>>
>>>>>>>>>> Will review, thanks!
>>>>>>>>>> One question that comes to mind is whether it's all about
>>>>>>>>>> bypassing stac/clac. Could you please include a performance
>>>>>>>>>> comparison with nosmap?
>>>>>>>>>
>>>>>>>>> On machine without SMAP (Sandy Bridge):
>>>>>>>>>
>>>>>>>>> Before: 4.8Mpps
>>>>>>>>> After: 5.2Mpps
>>>>>>>>
>>>>>>>> OK so would you say it's really unsafe versus safe accesses?
>>>>>>>> Or would you say it's just a better written code?
>>>>>>>
>>>>>>> It's the effect of removing speculation barrier.
>>>>>>
>>>>>> You mean __uaccess_begin_nospec introduced by
>>>>>> commit 304ec1b050310548db33063e567123fae8fd0301
>>>>>> ?
>>>>>>
>>>>>> So fundamentally we do access_ok checks when supplying
>>>>>> the memory table to the kernel thread, and we should
>>>>>> do the spec barrier there.
>>>>>>
>>>>>> Then we can just create and use a variant of uaccess macros that
>>>>>> does not include the barrier?
>>>>>>
>>>>>> Or, how about moving the barrier into access_ok?
>>>>>> This way repeated accesses with a single access_ok get a bit faster.
>>>>>> CC Dan Williams on this idea.
>>>>>
>>>>> It would be interesting to see how expensive re-doing the address
>>>>> limit check is compared to the speculation barrier. I.e. just switch
>>>>> vhost_get_user() to use get_user() rather than __get_user(). That
>>>>> will sanitize the pointer in the speculative path without a barrier.
>>>>
>>>> Hmm it's way cheaper even though IIRC it's measurable.
>>>> Jason, would you like to try?
>>>> Although frankly __get_user being slower than get_user feels very wrong.
>>>> Not yet sure what to do exactly but would you agree?
>>>
>>> Agree. __get_user() being faster than get_user() defeats the whole
>>> point of converting code paths to the access_ok() + __get_user()
>>> pattern.
>>
>> Did you mean the reverse?
>
> Hmm, no... I'll rephrase: __get_user() should have lower overhead than
> get_user().

Right ...

Linus, given that you just changed all users of access_ok anyway, do
you still think that the access_ok() conversion to return a
speculation-sanitized pointer or NULL is too big a conversion?

It was previously discarded here:
https://lkml.org/lkml/2018/1/17/929
but at that point we didn't have numbers and there was an understandable
rush to ship something safe.

At this point I think that vhost can show very measurable gains from
this conversion.

Thanks,

-- 
MST