RE: [PATCH v3 2/6] x86/uaccess: Avoid barrier_nospec() in 64-bit __get_user()
From: David Laight <hidden>
Date: 2024-11-22 09:39:02
Also in:
lkml
From: Linus Torvalds
Sent: 21 November 2024 22:16 On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf [off-list ref] wrote:quoted
The profile is showing futex_get_value_locked():Ahh.quoted
That has several callers, so we can probably just use get_user() there?Yeah, that's the simplest thing. That thing isn't even some inline function, so the real cost is the call. That said, exactly because it's not inlined, and calls are expensive, and this is apparently really critical, we can just do it with the full "unsafe_get_user()" model. It's not so complicated. The attached patch is untested, but I did check that it generates almost perfect code: mov %gs:0x0,%rax # current incl 0x1a9c(%rax) # current->pagefault_disable++ movabs $0x123456789abcdef,%rcx # magic virtual address size cmp %rsi,%rcx # address masking sbb %rcx,%rcx or %rsi,%rcx stac # enable user space acccess mov (%rcx),%ecx # get the value clac # disable user space access decl 0x1a9c(%rax) # current->pagefault_disable-- mov %ecx,(%rdi) # save the value xor %eax,%eax # return 0 ret (with the error case for the page fault all out-of-line).
I presume you are assuming an earlier access_ok() call? IIRC all x86 from 286 onwards fault accesses that cross the ~0 to 0 boundary. So you don't need to have called access_ok() prior to the above for non-byte accesses. Even for byte accesses (and with ~0 being a valid address) do the address mask before pagefault_disable++ and the error test is 'jc label' after the 'cmp'. Don't you also get better code for an 'asm output with goto' form? While it requires the caller have a 'label: return -EFAULT;' somewhere that is quite common in kernel code.
So this should be _faster_ than the old __get_user(), because while the address masking is not needed, it's cheaper than the function call used to be and the error handling is better.
Perhaps get_user() should be the function call and __get_user() inlined. Both including address validation but different calling conventions? (After fixing the code that doesn't want address masking.) ...
Now, we could possibly say "just remove the fence in __get_user() entirely", but that would involve moving it to access_ok().
How valid would it be to assume an explicit access_ok() was far enough away from the get_user() that you couldn't speculate all the way to something that used the read value to perform another kernel access?
And then it wouldn't actually speed anything up (except the horrid architecture-specific kernel uses that then don't call access_ok() at all - and we don't care about *those*).
Find and kill :-) David
Linus- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)