Thread (54 messages) 54 messages, 6 authors, 2023-07-12

Re: [PATCH v9 23/42] Documentation/x86: Add CET shadow stack description

From: szabolcs.nagy@arm.com <hidden>
Date: 2023-06-20 09:18:11
Also in: linux-arch, linux-doc, linux-mm, lkml

The 06/19/2023 16:44, Edgecombe, Rick P wrote:
On Mon, 2023-06-19 at 09:47 +0100, szabolcs.nagy@arm.com wrote:
quoted
The 06/14/2023 16:57, Edgecombe, Rick P wrote:
quoted
On Wed, 2023-06-14 at 11:43 +0100, szabolcs.nagy@arm.com wrote:
quoted
i dont think you can add sigaltshstk later.

libgcc already has unwinder code for shstk and that cannot handle
discontinous shadow stack.
Are you referring to the existing C++ exception unwinding code that
expects a different signal frame format? Yea this is a problem, but
I
don't see how it's a problem with any solutions now that will be
harder
later. I mentioned it when I brought up all the app compatibility
problems.[0]
there is old unwinder code incompatible with the current patches,
but that was fixed. however the new unwind code assumes signal
entry pushes one extra token that just have to be popped from the
shstk. this abi cannot be expanded which means

1) kernel cannot push more tokens for more integrity checks
   (or to add whatever other features)

2) sigaltshstk cannot work.

if the unwinder instead interprets the token to be the old ssp and
either incssp or switch to that ssp (depending on continous or
discontinous shstk, which the unwinder can detect), then 1) and 2)
are fixed.

but currently the distributed unwinder binary is incompatible with
1) and 2) so sigaltshstk cannot be added later. breaking the unwind
abi is not acceptable.
Can you point me to what you are talking about? I tested adding fields
to the shadow stack on top of these changes. It worked with HJ's new
(unposted I think) C++ changes for gcc. Adding fields doesn't work with
the existing gcc because it assumes a fixed size.
if there is a fix that's good, i haven't seen it.

my point was that the current unwinder works with current kernel
patches, but does not allow future extensions which prevents
sigaltshstk to work. the unwinder is not versioned so this cannot
be fixed later. it only works if distros ensure shstk is disabled
until the unwinder is fixed. (however there is no way to detect
old unwinder if somebody builds gcc from source.)

also note that there is generic code in the unwinder that will
deal with this and likely the x86 patches will conflict with
arm and riscv etc patches that try to fix the same issue..
so posting patches on the tools side of the abi would be useful
at this point.
quoted
quoted
The problem is that gcc expects a fixed 8 byte sized shadow stack
signal frame. The format in these patches is such that it can be
expanded for the sake of supporting alt shadow stack later, but it
happens to be a fixed 8 bytes for now, so it will work seamlessly
with
these old gcc's. HJ has some patches to fix GCC to jump over a
dynamically sized shadow stack signal frame, but this of course
won't
stop old gcc's from generating binaries that won't work with an
expanded frame.

I was waffling on whether it would be better to pad the shadow
stack
[1] signal frame to start, this would break compatibility with any
binaries that use this -fnon-call-exceptions feature (if there are
any), but would set us up better for the future if we got away with
it.
i don't see how -fnon-call-exceptions is relevant.
It uses unwinder code that does assume a fixed shadow stack signal
frame size. Since gcc 8.5 I think. So these compilers will continue to
generate code that assumes a fixed frame size. This is one of the
limitations we have for not moving to a new elf bit.
how does "fixed shadow stack signal frame size" relates to
"-fnon-call-exceptions"?

if there were instruction boundaries within a function where the
ret addr is not yet pushed or already poped from the shstk then
the flag would be relevant, but since push/pop happens atomically
at function entry/return -fnon-call-exceptions makes no
difference as far as shstk unwinding is concerned.
quoted
you can unwind from a signal handler (this is not a c++ question
but unwind abi question) and in practice eh works e.g. if the
signal is raised (sync or async) in a frame where there are no
cleanup handlers registered. in practice code rarely relies on
this (because it's not valid in c++). the main user of this i
know of is the glibc cancellation implmentation. (that is special
in that it never catches the exception so ssp does not have to be
updated for things to work, but in principle the unwinder should
still verify the entries on shstk, otherwise the security
guarantees are broken and the cleanup handlers can be hijacked.
there are glibc abi issues that prevent fixing this, but in other
libcs this may be still relevant).
I'm not fully sure what you are trying to say here. The glibc shadow
you mentioned -fnon-call-exceptions and i'm saying even without that
unwinding from signal handler is relevant and has to track shstk and
ideally even update it for eh control transfer (i.e. properly switch
to a different shstk in case of discontinous shstk).
stack stuff that is there today supports unwinding through a signal
handler. The longjmp code (unlike fnon-call-exections) doesn't look at
the shstk signal frame. It just does INCSSP until it reaches its
desired SSP, not caring what it is INCSSPing over.
x86 longjmp has differnet problems (cannot handle discontinous
shstk now).

glibc cancellation is a mix of unwinding and special longjmp and
it is currently broken in that the unwind bit cannot verify the
return addresses. the unwinder does control transfer to cleanup
handlers so control flow hijack is possible in principle on a
corrupt stack (though i don't think cancellation is a practical
attack surface).
quoted
longjmp can support discontinous shadow stack without wrss.
the current code proposed to glibc does not, which is wrong
(it breaks altshstk and green thread users like qemu for no
good reason).

declaring things unsupported means you have to go around to
audit and mark binaries accordingly.
The idea that all apps can be supported without auditing has been
assumed to be impossible by everyone I've talked to, including the
GLIBC developers deeply versed in the architectural limitations of this
feature. So if you have a magic solution, then that is a notable claim
and I think you should propose it instead of just alluding to the fact
that there is one.
there is no magic, longjmp should be implemented as:

	target_ssp = read from jmpbuf;
	current_ssp = read ssp;
	for (p = target_ssp; p != current_ssp; p--) {
		if (*p == restore-token) {
			// target_ssp is on a different shstk.
			switch_shstk_to(p);
			break;
		}
	}
	for (; p != target_ssp; p++)
		// ssp is now on the same shstk as target.
		inc_ssp();

this is what setcontext is doing and longjmp can do the same:
for programs that always longjmp within the same shstk the first
loop is just p = current_ssp, but it also works when longjmp
target is on a different shstk assuming nothing is running on
that shstk, which is only possible if there is a restore token
on top.

this implies if the kernel switches shstk on signal entry it has
to add a restore-token on the switched away shstk.
The only non-WRSS "longjmp from an alt shadow stack solution" that I
can think of would have something like a new syscall performing some
limited shadow stack actions normally prohibited in userspace by the
there is setcontext and swapcontext already doing an shstk
switch, i don't see why you think longjmp is different and
needs magic syscalls or wrss.
architecture. We'd have to think through how this would impact the
security. There are a lot of security/compatibility tradeoffs to parse
in this. So also, just because something can be done, doesn't mean we
should do it. I think the philosophy at this point is, lets get the
basics working that can support most apps, and learn more about stuff
like where this bar is in the real world.
i think longjmp should really be discussed with libc devs,
not on the kernel list, since they know the practical
constraints and trade-offs better. however longjmp is
relevant for the signal abi design so it's not ideal to
push a linux abi and then have the libc side discussion
later..
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help