Thread (118 messages) 118 messages, 12 authors, 2016-06-23

Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking

From: Andy Lutomirski <luto@amacapital.net>
Date: 2016-06-23 22:01:04
Also in: linux-s390, lkml

On Thu, Jun 23, 2016 at 1:40 PM, Josh Poimboeuf [off-list ref] wrote:
quoted hunk ↗ jump to hunk
On Thu, Jun 23, 2016 at 01:31:32PM -0500, Josh Poimboeuf wrote:
quoted
On Thu, Jun 23, 2016 at 09:35:29AM -0700, Andy Lutomirski wrote:
quoted
quoted
So which is the least-bad option?  To summarize:

  1) task flag(s) for preemption and page faults

  2) turn pt_regs into a stack frame

  3) annotate all calls from entry code in a table

  4) encode rbp on entry

They all have their issues, though I'm partial to #2.

Any more hare-brained ideas? :-)
I'll try to take a closer look at #2 and see just how much I dislike
all the stack frame munging.
Ok.
quoted
Also, in principle, it's only the
sleeping calls and the calls that make it into real (non-entry) kernel
code that really want to be unwindable through this mechanism.
Yeah, that's true.  We could modify options 2 or 3 to be less absolute.
Though I think that makes them more prone to future breakage.
quoted
FWIW, I don't care that much about preserving gdb's partial ability to
unwind through pt_regs, especially because gdb really ought to be able
to use DWARF, too.
Hm, that's a good point.  I really don't know if there are any other
external tools out there that would care.  Maybe we could try option 4
and then see if anybody complains.
I'm starting to think hare-brained option 4 is the way to go.  Any
external tooling should really be relying on DWARF anyway.

Here's a sneak preview.  If this general approach looks ok to you, I'll
go ahead and port all the in-tree unwinders and post a proper patch.

Instead of using xor -1 on the pt_regs pointer, I just cleared the
high-order bit.  That makes the unwinding experience much more pleasant
for a human stack walker, and also ensures that anybody trying to
dereference it gets slapped with an oops, at least in the 48-bit address
space era.
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 9a9e588..bf397426 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -201,6 +201,23 @@ For 32-bit we have the following conventions - kernel is built with
        .byte 0xf1
        .endm

+       /*
+        * This is a sneaky trick to help the unwinder find pt_regs on the
+        * stack.  The frame pointer is replaced with an encoded pointer to
+        * pt_regs.  The encoding is just a clearing of the highest-order bit,
+        * which makes it an invalid address and is also a signal to the
+        * unwinder that it's a pt_regs pointer in disguise.
+        *
+        * NOTE: This must be called *after* SAVE_EXTRA_REGS because it
+        * corrupts rbp.
+        */
+       .macro ENCODE_FRAME_POINTER ptregs_offset=0
+#ifdef CONFIG_FRAME_POINTER
+       leaq \ptregs_offset(%rsp), %rbp
+       btr $63, %rbp
+#endif
+       .endm
+
Maybe optimize slightly:

.ifeq \ptregs_offset
mov %rsp, %rbp
.else
leaq \ptregs_offset(%rsp), %rbp
.endif
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help