Thread: 118 messages, 12 authors, 2016-06-23

Re: [RFC PATCH v2 03/18] x86/asm/head: standardize the bottom of the stack for idle tasks

From: Josh Poimboeuf <hidden>
Date: 2016-04-29 23:27:51
Also in: linuxppc-dev, lkml

On Fri, Apr 29, 2016 at 02:38:02PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 29, 2016 at 1:50 PM, Josh Poimboeuf [off-list ref] wrote:
> > On Fri, Apr 29, 2016 at 12:39:16PM -0700, Andy Lutomirski wrote:
> > > On Thu, Apr 28, 2016 at 1:44 PM, Josh Poimboeuf [off-list ref] wrote:
> > > > Thanks to all the recent x86 entry code refactoring, most tasks' kernel
> > > > stacks start at the same offset right above their saved pt_regs,
> > > > regardless of which syscall was used to enter the kernel.  That creates
> > > > a nice convention which makes it straightforward to identify the
> > > > "bottom" of the stack, which can be useful for stack walking code which
> > > > needs to verify the stack is sane.
> > > >
> > > > However there are still a few types of tasks which don't yet follow that
> > > > convention:
> > > >
> > > > 1) CPU idle tasks, aka the "swapper" tasks
> > > >
> > > > 2) freshly forked TIF_FORK tasks which don't have a stack at all
> > > >
> > > > Make the idle tasks conform to the new stack bottom convention by
> > > > starting their stack at a sizeof(pt_regs) offset from the end of the
> > > > stack page.
> > > >
> > > > Signed-off-by: Josh Poimboeuf <redacted>
> > > > ---
> > > >  arch/x86/kernel/head_64.S | 7 ++++---
> > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > > > index 6dbd2c0..0b12311 100644
> > > > --- a/arch/x86/kernel/head_64.S
> > > > +++ b/arch/x86/kernel/head_64.S
> > > > @@ -296,8 +296,9 @@ ENTRY(start_cpu)
> > > >          *      REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > > >          *              address given in m16:64.
> > > >          */
> > > > -       movq    initial_code(%rip),%rax
> > > > -       pushq   $0              # fake return address to stop unwinder
> > > > +       call    1f              # put return address on stack for unwinder
> > > > +1:     xorq    %rbp, %rbp      # clear frame pointer
> > > > +       movq    initial_code(%rip), %rax
> > > >         pushq   $__KERNEL_CS    # set correct cs
> > > >         pushq   %rax            # target address in negative space
> > > >         lretq
> > > > @@ -325,7 +326,7 @@ ENDPROC(start_cpu0)
> > > >         GLOBAL(initial_gs)
> > > >         .quad   INIT_PER_CPU_VAR(irq_stack_union)
> > > >         GLOBAL(initial_stack)
> > > > -       .quad  init_thread_union+THREAD_SIZE-8
> > > > +       .quad  init_thread_union + THREAD_SIZE - SIZEOF_PTREGS
> > >
> > > As long as you're doing this, could you also set orig_ax to -1?  I
> > > remember running into some oddities resulting from orig_ax containing
> > > garbage at some point.
> >
> > I assume you mean to initialize the orig_rax value in the pt_regs at the
> > bottom of the stack of the idle task?
> >
> > How could that cause a problem?  Since the idle task never returns from
> > a system call, I'd assume that memory never gets accessed?
>
> Look at collect_syscall in lib/syscall.c

I don't see how collect_syscall() can be called for the per-cpu idle
"swapper" tasks (which is what the above code affects).  They don't have
pids or /proc entries so you can't do /proc/<pid>/syscall on them.
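
For reference, here is a rough user-space sketch (not kernel code, and not part of the patch) of where that pt_regs slot would sit under the new stack bottom convention, and which word setting orig_ax to -1 would touch. THREAD_SIZE_SKETCH, struct pt_regs_sketch and the helper names are made up for illustration, and the 16 KiB stack size is only an assumption about a typical x86-64 configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stack size, for illustration only; the real THREAD_SIZE is config dependent. */
#define THREAD_SIZE_SKETCH (16 * 1024)

/* Hypothetical stand-in mirroring the field order of the x86-64 struct pt_regs. */
struct pt_regs_sketch {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;
};

/* Initial stack pointer: end of the stack area minus one pt_regs. */
static uintptr_t initial_stack_sketch(uintptr_t stack_base)
{
	return stack_base + THREAD_SIZE_SKETCH - sizeof(struct pt_regs_sketch);
}

/* The pt_regs slot occupying the last sizeof(pt_regs) bytes of the stack area. */
static struct pt_regs_sketch *bottom_regs_sketch(void *stack_base)
{
	return (struct pt_regs_sketch *)((char *)stack_base + THREAD_SIZE_SKETCH) - 1;
}

/* What the orig_ax suggestion would amount to: mark the slot as "not in a syscall". */
static void mark_not_in_syscall_sketch(void *stack_base)
{
	bottom_regs_sketch(stack_base)->orig_ax = (uint64_t)-1;
}
```

With these assumed sizes the stand-in struct is 21 eight-byte slots, so initial_stack_sketch(base) comes out to base + 16384 - 168, and orig_ax is the sixth 64-bit word counting back from the end of the stack area.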

-- 
Josh