Re: [RFC PATCH v2 03/18] x86/asm/head: standardize the bottom of the stack for idle tasks
From: Josh Poimboeuf <hidden>
Date: 2016-04-29 23:27:51
Also in:
linuxppc-dev, lkml
On Fri, Apr 29, 2016 at 02:38:02PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 29, 2016 at 1:50 PM, Josh Poimboeuf [off-list ref] wrote:
> > On Fri, Apr 29, 2016 at 12:39:16PM -0700, Andy Lutomirski wrote:
> > > On Thu, Apr 28, 2016 at 1:44 PM, Josh Poimboeuf [off-list ref] wrote:
> > > > Thanks to all the recent x86 entry code refactoring, most tasks'
> > > > kernel stacks start at the same offset right above their saved
> > > > pt_regs, regardless of which syscall was used to enter the kernel.
> > > > That creates a nice convention which makes it straightforward to
> > > > identify the "bottom" of the stack, which can be useful for stack
> > > > walking code which needs to verify the stack is sane.
> > > >
> > > > However there are still a few types of tasks which don't yet follow
> > > > that convention:
> > > >
> > > > 1) CPU idle tasks, aka the "swapper" tasks
> > > >
> > > > 2) freshly forked TIF_FORK tasks which don't have a stack at all
> > > >
> > > > Make the idle tasks conform to the new stack bottom convention by
> > > > starting their stack at a sizeof(pt_regs) offset from the end of the
> > > > stack page.
> > > >
> > > > Signed-off-by: Josh Poimboeuf <redacted>
> > > > ---
> > > >  arch/x86/kernel/head_64.S | 7 ++++---
> > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > > > index 6dbd2c0..0b12311 100644
> > > > --- a/arch/x86/kernel/head_64.S
> > > > +++ b/arch/x86/kernel/head_64.S
> > > > @@ -296,8 +296,9 @@ ENTRY(start_cpu)
> > > >  	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > > >  	 *		address given in m16:64.
> > > >  	 */
> > > > -	movq	initial_code(%rip),%rax
> > > > -	pushq	$0		# fake return address to stop unwinder
> > > > +	call	1f		# put return address on stack for unwinder
> > > > +1:	xorq	%rbp, %rbp	# clear frame pointer
> > > > +	movq	initial_code(%rip), %rax
> > > >  	pushq	$__KERNEL_CS	# set correct cs
> > > >  	pushq	%rax		# target address in negative space
> > > >  	lretq
> > > > @@ -325,7 +326,7 @@ ENDPROC(start_cpu0)
> > > >  	GLOBAL(initial_gs)
> > > >  	.quad	INIT_PER_CPU_VAR(irq_stack_union)
> > > >  	GLOBAL(initial_stack)
> > > > -	.quad  init_thread_union+THREAD_SIZE-8
> > > > +	.quad  init_thread_union + THREAD_SIZE - SIZEOF_PTREGS
> > >
> > > As long as you're doing this, could you also set orig_ax to -1? I
> > > remember running into some oddities resulting from orig_ax containing
> > > garbage at some point.
> >
> > I assume you mean to initialize the orig_rax value in the pt_regs at the
> > bottom of the stack of the idle task?
> >
> > How could that cause a problem? Since the idle task never returns from
> > a system call, I'd assume that memory never gets accessed?
>
> Look at collect_syscall in lib/syscall.c

I don't see how collect_syscall() can be called for the per-cpu idle
"swapper" tasks (which is what the above code affects). They don't have
pids or /proc entries so you can't do /proc/<pid>/syscall on them.

-- 
Josh
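
For readers following the thread, here is a minimal userspace sketch of the stack-bottom convention under discussion. It is not kernel code: every model_* name, the 16 KiB MODEL_THREAD_SIZE, and the simplified register layout are assumptions made for illustration. It shows where a pt_regs-sized area sits at the end of the stack, why an initial stack pointer of "end of stack minus sizeof(pt_regs)" lets a stack walker recognise the bottom, and what setting orig_ax to -1 in that area would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 KiB kernel stacks; the real THREAD_SIZE is config-dependent. */
#define MODEL_THREAD_SIZE (16 * 1024)

/* Simplified stand-in for x86-64 struct pt_regs (21 word-sized slots). */
struct model_pt_regs {
	unsigned long r15, r14, r13, r12, bp, bx;
	unsigned long r11, r10, r9, r8, ax, cx, dx, si, di;
	unsigned long orig_ax;
	unsigned long ip, cs, flags, sp, ss;
};

/* Hypothetical stack area standing in for init_thread_union. */
static unsigned long model_stack[MODEL_THREAD_SIZE / sizeof(unsigned long)];

/* Address one past the last byte of the stack area. */
static uintptr_t model_stack_end(const void *stack_base)
{
	assert(stack_base != NULL);
	return (uintptr_t)stack_base + MODEL_THREAD_SIZE;
}

/* The saved-register area occupies the last sizeof(pt_regs) bytes. */
static struct model_pt_regs *model_task_pt_regs(void *stack_base)
{
	return (struct model_pt_regs *)model_stack_end(stack_base) - 1;
}

/* Models init_thread_union + THREAD_SIZE - SIZEOF_PTREGS from the patch. */
static uintptr_t model_initial_stack(const void *stack_base)
{
	return model_stack_end(stack_base) - sizeof(struct model_pt_regs);
}

/* A walker can call the stack sane if it ends exactly at the pt_regs area. */
static int model_stack_bottom_ok(void *stack_base, uintptr_t last_sp)
{
	return last_sp == (uintptr_t)model_task_pt_regs(stack_base);
}

/* Models the suggestion to mark the idle task as "not in a syscall". */
static void model_init_idle_regs(void *stack_base)
{
	model_task_pt_regs(stack_base)->orig_ax = (unsigned long)-1;
}
```

With the old "THREAD_SIZE - 8" start, the same check would fail, because the walk would end 8 bytes below the end of the stack rather than at the pt_regs boundary.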