Thread: 18 messages, 5 authors, 2014-09-04

[PATCH v5 1/3] ARM: probes: check stack operation when decoding

From: Jon Medhurst Tixy <hidden>
Date: 2014-09-01 17:29:48
Also in: lkml

On Sat, 2014-08-30 at 09:28 +0800, Wang Nan wrote:
On 2014/8/29 16:47, Jon Medhurst (Tixy) wrote:
On Thu, 2014-08-28 at 11:24 +0100, Will Deacon wrote:
On Thu, Aug 28, 2014 at 11:20:21AM +0100, Russell King - ARM Linux wrote:
On Thu, Aug 28, 2014 at 06:51:15PM +0900, Masami Hiramatsu wrote:
(2014/08/27 22:02), Wang Nan wrote:
This patch improves the ARM instruction decoder, allowing it to check
whether an instruction is a stack store operation. This information is
important for kprobe optimization.

For normal str instructions, this patch adds a series of _SP_STACK
register indicators in the decoder to test the base and offset registers
in str <Rt>, [<Rn>, <Rm>] against sp.

For stm instructions, it checks the sp register in the instruction-specific
decoder.
OK, reviewed. But since I'm not so sure about the ARM32 ISA,
I need help from an ARM32 maintainer to ack this.
What you actually need is an ack from the ARM kprobes people who
understand this code.  That would be much more meaningful than my
ack.  They're already on the Cc list.
Tixy, can you take a look please?
I'll take an in depth look on Monday as I'm currently on holiday, so for
now just some brief and possibly not well thought out comments...

- If the intent is to not optimise stack push operations, then this
actually excludes the main use of kprobes, which I believe is to insert
probes at the start of functions (there's even a specific jprobes API
for that). This is because functions usually start by saving registers
on the stack.
Agreed. If the decoder can provide more information, kprobeopt can dynamically
compute the range of stack an instruction requires and adjust the stack
protection range accordingly. This requires the ARM decoder to report more
information. For example, for a "push {r4, r5}" instruction, the decoder should
report that it is a stack store operation requiring 8 bytes of stack; then,
when composing the trampoline code, we can put registers at [sp, #-8]. Only
instructions such as "str r0, [sp, r1]" should be prevented.
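
As a rough illustration of the "push {r4, r5} requires 8 bytes" example, here
is a minimal C sketch that computes the stack bytes written by an A32
"stmdb sp!, {reglist}" instruction. The function name push_stack_bytes is
hypothetical and is not part of the kernel's probes decoder; it only shows the
arithmetic (4 bytes per register in the list).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel's decoder.
 * Returns the number of stack bytes written by an ARM (A32)
 * "stmdb sp!, {reglist}" instruction, i.e. "push {reglist}",
 * or 0 if insn is not of that form.
 */
static unsigned int push_stack_bytes(uint32_t insn)
{
	uint32_t reglist = insn & 0xffffu;
	unsigned int nregs = 0;

	/* cond == 0b1111 is the unconditional space, not a conditional STM. */
	if ((insn >> 28) == 0xfu)
		return 0;

	/* Match STMDB with writeback (P=1, U=0, S=0, W=1, L=0) and Rn == sp. */
	if ((insn & 0x0fff0000u) != 0x092d0000u)
		return 0;

	/* Each set bit in the register list is one 4-byte store. */
	for (; reglist; reglist &= reglist - 1)
		nregs++;

	return 4 * nregs;
}
```

For instance, push_stack_bytes(0xe92d0030), the encoding of "push {r4, r5}",
yields 8. Note that a single-register push is normally encoded as
"str Rt, [sp, #-4]!" and is not covered by this sketch.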

However, this needs more improvement to the decoder: all store operations would
then have to use a special decoder. What do you think?
This doesn't work for the non-optimised kprobes case because, when a
probe is hit, we couldn't know what stack addresses to reserve until
we're several calls deep in the exception handler and possibly already
using those addresses. Anyway, perhaps we don't need to worry about
these instructions after all, more below...
- Crowbarring in special case testing for stack operations looks a bit
inelegant and not a sustainable way of doing this, what about the next
special case we need? However, stack push operations _are_ a general
special case for instruction emulation, so perhaps that's OK, and leads
me to...

- The current 'unoptimised' kprobes implementation allows for pushing on
the stack (see __und_svc and the unused (?) jprobe_return) but this is
just aimed at stm instructions, not things like "str r0, [sp, -imm]!"
that might be used to simultaneously save a register and reserve an
arbitrary amount of stack space. Probing such instructions could lead to
the kprobes code trashing the kernel stack.
With a quick search I found only two instructions matching "str.*\[sp,[^\]]*-[^4]",
one in Ldiv0_64 and another in Ldiv0; both are "str     lr, [sp, #-8]!". So I think
such instructions are very rare.
Yes, I built a multi_v7_defconfig kernel with GCC 4.9 and I too could
only find those occurrences of the problematic instructions, which come
from human-written assembler, so we probably aren't restricting any
kprobes users if we don't support probing of those types of str
instructions.
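
To make the class of instruction being discussed concrete, the following is a
minimal C sketch that recognises an A32 "str Rt, [sp, #-imm]!" whose
pre-decrement is something other than 4, such as "str lr, [sp, #-8]!". The
name is_unusual_sp_predec_str is hypothetical; this is not how the kernel
decoder is structured, just an encoding check under the assumption that only
the immediate, pre-indexed, writeback form matters.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code.
 * True for "str Rt, [sp, #-imm]!" with imm != 4, i.e. a store that
 * both saves a register and moves sp by more than one word.
 */
static bool is_unusual_sp_predec_str(uint32_t insn)
{
	/* cond == 0b1111 is the unconditional space, not a conditional STR. */
	if ((insn >> 28) == 0xfu)
		return false;

	/* STR (immediate): P=1, U=0, B=0, W=1, L=0, and Rn == sp. */
	if ((insn & 0x0fff0000u) != 0x052d0000u)
		return false;

	/* imm12 == 4 is the ordinary single-register push. */
	return (insn & 0xfffu) != 4;
}
```

With this check, 0xe52de008 ("str lr, [sp, #-8]!") is flagged, while
0xe52d4004 ("str r4, [sp, #-4]!", a one-register push) is not.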

That would just leave us to support stm instructions which push
registers onto the stack, and the optimised kprobes could take the same
approach as the unoptimised ones and just unconditionally reserve 64
bytes of stack on every probe (see __und_svc in entry-armv.S).
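
A small C model of why a fixed 64-byte reservation covers stm pushes but not
arbitrary pre-decrement stores: an stm can name at most 16 registers, so it
writes at most 16 * 4 = 64 bytes below sp. The names STACK_HOLE and
store_fits_in_hole are hypothetical, and the model assumes the probe handler's
own saved state lives entirely below the reserved hole.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed size of the gap left below the probed code's sp. */
#define STACK_HOLE 64u

/*
 * Hypothetical model, not kernel code.
 * sp_at_probe is the stack pointer when the probe hits (assumed to be
 * at least STACK_HOLE). bytes_below_sp is how far below sp the emulated
 * store writes. Returns true if the store stays inside the hole.
 */
static bool store_fits_in_hole(uint32_t sp_at_probe, uint32_t bytes_below_sp)
{
	uint32_t hole_bottom = sp_at_probe - STACK_HOLE;
	uint32_t store_bottom = sp_at_probe - bytes_below_sp;

	return store_bottom >= hole_bottom;
}
```

A 16-register stm (64 bytes) fits, whereas a hypothetical
"str r0, [sp, #-72]!" would reach past the hole into whatever the handler
stored there.
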
Furthermore, I thought "unoptimised" kprobes use another stack. Could you please
explain how such probing could trash the normal kernel stack?
No, unoptimised probes don't use another stack, they use the stack of
the current kernel task.

-- 
Tixy