Thread (25 messages) 25 messages, 5 authors, 2018-05-14

[PATCH 2/2] arm64: Clear the stack

From: Alexander Popov <hidden>
Date: 2018-05-14 13:53:19
Also in: lkml

Possibly related (same subject, not in this thread)

On 14.05.2018 13:06, Mark Rutland wrote:
On Mon, May 14, 2018 at 12:35:25PM +0300, Alexander Popov wrote:
quoted
On 14.05.2018 08:15, Mark Rutland wrote:
quoted
On Sun, May 13, 2018 at 11:40:07AM +0300, Alexander Popov wrote:
quoted
So what would you think if I do the following in check_alloca():

	if (size >= stack_left) {
#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
		panic("alloca over the kernel stack boundary\n");
#else
		BUG();
#endif
Given this is already out-of-line, how about we always use panic(), regardless
of VMAP_STACK and SCHED_STACK_END_CHECK? i.e. just

	if (unlikely(size >= stack_left))
		panic("alloca over the kernel stack boundary");

If we have VMAP_STACK selected, and overflow during the panic, it's the same as
if we overflowed during the BUG(). It's likely that panic() will use less stack
space than BUG(), and the compiler can put the call in a slow path that
shouldn't affect most calls, so in all cases it's likely preferable.
I'm sure that maintainers and Linus will strongly dislike my patch if I always
use panic() here. panic() kills the whole kernel and we shouldn't use it when we
can safely continue to work.

Let me describe my logic. So let's have size >= stack_left on a thread stack.

1. If CONFIG_VMAP_STACK is enabled, we can safely use BUG(). Even if BUG()
handling overflows the thread stack into the guard page, handle_stack_overflow()
is called and the neighbour memory is not corrupted. The kernel can proceed to live.
On arm64 with CONFIG_VMAP_STACK, a stack overflow will result in a
panic(). My understanding was that the same is true on x86.
No, x86 CONFIG_VMAP_STACK only kills the offending process. I see it on my deep
recursion test, the kernel continues to live. handle_stack_overflow() in
arch/x86/kernel/traps.c calls die().
quoted
2. If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt the neighbour
kernel memory and cause the undefined behaviour of the whole kernel. I see it on
my lkdtm test. That is a cogent reason for panic().
In this case, panic() can also corrupt the neighbour stack, and could
also fail.

When CONFIG_VMAP_STACK is not selected, a stack overflow simply cannot
be handled reliably -- while panic() may be more likely to succeed, it
is not gauranteed to.
quoted
2.a. If CONFIG_SCHED_STACK_END_CHECK is enabled, the kernel already does panic()
when STACK_END_MAGIC is corrupted. So we will _not_ break the safety policy if
we do panic() in a similar situation in check_alloca().
Sure, I'm certainly happy with panic() here.
Ok!
quoted
2.b. If CONFIG_SCHED_STACK_END_CHECK is disabled, the user has some real reasons
not to do panic() when the kernel stack is corrupted. 
I believe that CONFIG_SCHED_STACK_END_CHECK is seen as a debug feature,
and hence people don't select it. 
I see CONFIG_SCHED_STACK_END_CHECK enabled by default in Ubuntu config...
I strongly doubt that people have
reasons to disable it other than not wanting the overhead associated
with debug features.
I think it's not a question of performance here. There are cases when a system
must live as long as possible (even partially corrupted) and must not die
entirely. Oops is ok for those systems, but panic (full DoS) is not.
I think it is reasonable to panic() here even with CONFIG_VMAP_STACK
selected.
It's too tough for CONFIG_VMAP_STACK on x86 - the system can proceed to live.
Anyway, the check_alloca() code will not be shared between x86 and arm64, I've
described the reasons in this thread. So I can have BUG() for CONFIG_VMAP_STACK
on x86 and Laura can consistently use panic() on arm64.
quoted
So we should not do it in check_alloca() as well, just use BUG() and
hope for the best.
Regardless of whether we BUG() or panic(), we're hoping for the best.

Consistently using panic() here will keep things simpler, so any failure
reported will be easier to reason about, and easier to debug.
Let me keep BUG() for !CONFIG_SCHED_STACK_END_CHECK. I beware of using panic()
by default, let distro/user decide this. I remember very well how I was shouted
at, when this one was merged:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce6fa91b93630396ca220c33dd38ffc62686d499


Mark, I'm really grateful to you for such a nice code review!
Alexander
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help