Thread (3 messages) 3 messages, 2 authors, 2018-03-19
STALE2992d

[PATCH 3.16 56/76] x86/syscall: Sanitize syscall table de-references under speculation

From: Ben Hutchings <hidden>
Date: 2018-03-12 03:06:12
Also in: lkml, stable

3.16.56-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <redacted>

commit 2fbd7af5af8665d18bcefae3e9700be07e22b681 upstream.

The upstream version of this, touching C code, was written by Dan Williams,
with the following description:
The syscall table base is a user controlled function pointer in kernel
space. Use array_index_nospec() to prevent any out of bounds speculation.

While retpoline prevents speculating into a userspace directed target it
does not stop the pointer de-reference, the concern is leaking memory
relative to the syscall table base, by observing instruction cache
behavior.
The x86_64 assembly version for 4.4 was written by Jiri Slaby, with
the following description:
In 4.4.118, we have commit c8961332d6da (x86/syscall: Sanitize syscall
table de-references under speculation), which is a backport of upstream
commit 2fbd7af5af86. But it fixed only the C part of the upstream patch
-- the IA32 sysentry. So it ommitted completely the assembly part -- the
64bit sysentry.

Fix that in this patch by explicit array_index_mask_nospec written in
assembly. The same was used in lib/getuser.S.

However, to have "sbb" working properly, we have to switch from "cmp"
against (NR_syscalls-1) to (NR_syscalls), otherwise the last syscall
number would be "and"ed by 0. It is because the original "ja" relies on
"CF" or "ZF", but we rely only on "CF" in "sbb". That means: switch to
"jae" conditional jump too.

Final note: use rcx for mask as this is exactly what is overwritten by
the 4th syscall argument (r10) right after.
In 3.16 the x86_32 syscall table lookup is also written in assembly.
So I've taken Jiri's version and added similar masking in entry_32.S,
using edx as the temporary.  edx is clobbered by SAVE_REGS and seems
to be free at this point.

Cc: Dan Williams <redacted>
Cc: Jiri Slaby <redacted>
Cc: Jan Beulich <redacted>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <redacted>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: alan@linux.intel.com
Cc: Jinpu Wang <redacted>
Signed-off-by: Ben Hutchings <redacted>
---
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -426,6 +426,8 @@ sysenter_past_esp:
 sysenter_do_call:
 	cmpl $(NR_syscalls), %eax
 	jae sysenter_badsys
+	sbb %edx, %edx				/* array_index_mask_nospec() */
+	and %edx, %eax
 	call *sys_call_table(,%eax,4)
 sysenter_after_call:
 	movl %eax,PT_EAX(%esp)
@@ -503,6 +505,8 @@ ENTRY(system_call)
 	cmpl $(NR_syscalls), %eax
 	jae syscall_badsys
 syscall_call:
+	sbb %edx, %edx				/* array_index_mask_nospec() */
+	and %edx, %eax
 	call *sys_call_table(,%eax,4)
 syscall_after_call:
 	movl %eax,PT_EAX(%esp)		# store the return value
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -445,12 +445,14 @@ GLOBAL(system_call_after_swapgs)
 	jnz tracesys
 system_call_fastpath:
 #if __SYSCALL_MASK == ~0
-	cmpq $__NR_syscall_max,%rax
+	cmpq	$NR_syscalls, %rax
 #else
 	andl $__SYSCALL_MASK,%eax
-	cmpl $__NR_syscall_max,%eax
+	cmpl	$NR_syscalls, %eax
 #endif
-	ja badsys
+	jae	badsys
+	sbb	%rcx, %rcx			/* array_index_mask_nospec() */
+	and	%rcx, %rax
 	movq %r10,%rcx
 #ifdef CONFIG_RETPOLINE
 	movq	sys_call_table(, %rax, 8), %rax
@@ -577,12 +579,14 @@ tracesys:
 	LOAD_ARGS ARGOFFSET, 1
 	RESTORE_REST
 #if __SYSCALL_MASK == ~0
-	cmpq $__NR_syscall_max,%rax
+	cmpq	$NR_syscalls, %rax
 #else
 	andl $__SYSCALL_MASK,%eax
-	cmpl $__NR_syscall_max,%eax
+	cmpl	$NR_syscalls, %eax
 #endif
-	ja   int_ret_from_sys_call	/* RAX(%rsp) set to -ENOSYS above */
+	jae	int_ret_from_sys_call		/* RAX(%rsp) set to -ENOSYS above */
+	sbb	%rcx, %rcx			/* array_index_mask_nospec() */
+	and	%rcx, %rax
 	movq %r10,%rcx	/* fixup for C */
 #ifdef CONFIG_RETPOLINE
 	movq	sys_call_table(, %rax, 8), %rax
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help