Re: [PATCH v2 03/11] tty: kbd: reduce stack size with KASAN
From: Arnd Bergmann <arnd@arndb.de>
Date: 2017-06-16 20:56:56
Also in:
lkml
Subsystem:
the rest, tty layer and serial drivers · Maintainers:
Linus Torvalds, Greg Kroah-Hartman, Jiri Slaby
On Fri, Jun 16, 2017 at 7:29 PM, Dmitry Torokhov [off-list ref] wrote:
On Fri, Jun 16, 2017 at 8:58 AM, Samuel Thibault [off-list ref] wrote:quoted
Arnd Bergmann, on ven. 16 juin 2017 17:41:47 +0200, wrote:quoted
The problem are the 'ch' and 'flag' variables that are passed into tty_insert_flip_char by value, and from there into tty_insert_flip_string_flags by reference. In this case, kasan tries to detect whether tty_insert_flip_string_flags() does any out-of-bounds access on the pointers and adds 64 bytes redzone around each of the two variables.Ouch.quoted
gcc-6.3.1 happens to inline 16 calls of tty_insert_flip_char() intoI wonder if we should stop marking tty_insert_flip_char() as inline.
That would be an easy solution, yes. tty_insert_flip_char() was
apparently meant to be optimized for the fast path to completely
avoid calling into another function, but that fast path got a bit more
complex with commit acc0f67f307f ("tty: Halve flip buffer
GFP_ATOMIC memory consumption").
If we move it out of line, the fast path optimization goes away and
we could just have a simple implementation like
int tty_insert_flip_char(struct tty_port *port, unsigned char ch, char flag)
{
struct tty_buffer *tb = port->buf.tail;
int flags = (flag == TTY_NORMAL) ? TTYB_NORMAL : 0;
if (!__tty_buffer_request_room(port, 1, flags))
return 0;
if (~tb->flags & TTYB_NORMAL)
*flag_buf_ptr(tb, tb->used) = flag;
*char_buf_ptr(tb, tb->used++) = ch;
return 1;
}
One rather simple change I found would actually avoid the warning
and would seem to actually give us better runtime behavior even
without KASAN:
diff --git a/include/linux/tty_flip.h b/include/linux/tty_flip.h
index c28dd523f96e..15d03a14ad0f 100644
--- a/include/linux/tty_flip.h
+++ b/include/linux/tty_flip.h@@ -26,7 +26,7 @@ static inline int tty_insert_flip_char(struct tty_port *port, *char_buf_ptr(tb, tb->used++) = ch; return 1; } - return tty_insert_flip_string_flags(port, &ch, &flag, 1); + return tty_insert_flip_string_fixed_flag(port, &ch, flag, 1); } static inline int tty_insert_flip_string(struct tty_port *port,
This reduces the stack frame size for kbd_event() to 1256 bytes, which is well within the limit, and it lets us keep the flag-less buffers across a 'tb->used >= tb->size' condition. Calling into tty_insert_flip_string_flags() today will allocate a flag buffer if there isn't already one, even when it is not needed.
quoted
I'm however afraid we'd have to mark a lot of static functions that way, depending on the aggressivity of gcc... I'd indeed really argue that gcc should consider stack usage when inlining. static int f(int foo) { char c[256]; g(c, foo); } is really not something that I'd want to see the compiler to inline.Why would not we want it be inlined? What we do not want us several calls having _separate_ instances of 'c' generated on the stack, all inlined calls should share 'c'. And of course if we have f1, f2, and f3 with c1, c2, and c3, GCC should not blow up the stack inlining and allocating stack for all 3 of them beforehand. But this all seems to me issue that should be solved in toolchain, not trying to play whack-a-mole with kernel sources.
The problem for the Samuel's example is that
a) the "--param asan-stack=1" option in KASAN does blow up the
stack, which is why the annotation is now called 'noinline_if_stackbloat'.
b) The toolchain cannot solve the problem, as most instances of the
problem (unlike kbd_put_queue) force the inlining unless you build
with the x86-specific CONFIG_OPTIMIZE_INLINING.
Arnd