Thread (24 messages) 24 messages, 6 authors, 2021-04-14

Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size

From: Len Brown <lenb@kernel.org>
Date: 2021-03-19 18:13:46
Also in: linux-arch, lkml

On Wed, Mar 17, 2021 at 6:45 AM Ingo Molnar [off-list ref] wrote:

* Ingo Molnar [off-list ref] wrote:
quoted
* Chang S. Bae [off-list ref] wrote:
quoted
During signal entry, the kernel pushes data onto the normal userspace
stack. On x86, the data pushed onto the user stack includes XSAVE state,
which has grown over time as new features and larger registers have been
added to the architecture.

MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
constant indicates to userspace how much data the kernel expects to push on
the user stack, [2][3].

However, this constant is much too small and does not reflect recent
additions to the architecture. For instance, when AVX-512 states are in
use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.

The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
cause user stack overflow when delivering a signal.
quoted
  uapi: Define the aux vector AT_MINSIGSTKSZ
  x86/signal: Introduce helpers to get the maximum signal frame size
  x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
  selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
  x86/signal: Detect and prevent an alternate signal stack overflow
  selftest/x86/signal: Include test cases for validating sigaltstack
So this looks really complicated, is this justified?

Why not just internally round up sigaltstack size if it's too small?
This would be more robust, as it would fix applications that use
MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.

I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the
segfault if MINSIGSTKSZ is used to create a small signal stack?
I.e. if the kernel sees a too small ->ss_size in sigaltstack() it
would ignore ->ss_sp and mmap() a new sigaltstack instead and use that
for the signal handler stack.

This would automatically make MINSIGSTKSZ - and other too small sizes
work today, and in the future.

But the question is, is there user-space usage of sigaltstacks that
relies on controlling or reading the contents of the stack?

longjmp using programs perhaps?
For the legacy binary that requests a too-small sigaltstack, there are
several choices:

We could detect the too-small stack at sigaltstack(2) invocation and
return an error.
This results in two deal-killing problems:
First, some applications don't check the return value, so the check
would be fruitless.
Second, those that check and error-out may be programs that never
actually take the signal, and so we'd be causing a dusty binary to
exit, when it didn't exit on another system, or another kernel.

Or we could detect the too small stack at signal registration time.
This has the same two deal-killers as above.

Then there is the approach in this patch-set, which detects an
imminent stack overflow at run time.
It has neither of the two problems above, and the benefit that we now
prevent data corruption
that could have been happening on some systems already today.  The
down side is that the dusty binary
that does request the too-small stack can now die at run time.

So your idea of recognizing the problem and conjuring up a sufficient
stack is compelling,
since it would likely "just work", no matter how dumb the program.
But where would the
the sufficient stack come from -- is this a new kernel buffer, or is
there a way to abscond
some user memory?  I would expect a signal handler to look at the data
on its stack
and nobody else will look at that stack.  But this is already an
unreasonable program for
allocating a special signal stack in the first place :-/ So yes, one
could imagine the signal
handler could longjump instead of gracefully completing, and if this
specially allocated
signal stack isn't where the user planned, that could be trouble.

Another idea we discussed was to detect the potential overflow at run-time,
and instead of killing the process, just push the signal onto the
regular user stack.
this might actually work, but it is sort of devious; and it would not
work in the case
where the user overflowed their regular stack already, which may be
the most (only?)
compelling reason that they allocated and declared a special
sigaltstack in the first place...

-- 
Len Brown, Intel Open Source Technology Center
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help