Re: [PATCH v9 23/42] Documentation/x86: Add CET shadow stack description

From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Date: 2023-06-19 16:45:47
Also in: linux-arch, linux-doc, linux-mm, lkml

On Mon, 2023-06-19 at 09:47 +0100, szabolcs.nagy@arm.com wrote:

The 06/14/2023 16:57, Edgecombe, Rick P wrote:

quoted

On Wed, 2023-06-14 at 11:43 +0100, szabolcs.nagy@arm.com wrote:

quoted

i dont think you can add sigaltshstk later.

libgcc already has unwinder code for shstk and that cannot handle
discontinous shadow stack.

Are you referring to the existing C++ exception unwinding code that
expects a different signal frame format? Yea this is a problem, but
I
don't see how it's a problem with any solutions now that will be
harder
later. I mentioned it when I brought up all the app compatibility
problems.[0]

there is old unwinder code incompatible with the current patches,
but that was fixed. however the new unwind code assumes signal
entry pushes one extra token that just have to be popped from the
shstk. this abi cannot be expanded which means

1) kernel cannot push more tokens for more integrity checks
   (or to add whatever other features)

2) sigaltshstk cannot work.

if the unwinder instead interprets the token to be the old ssp and
either incssp or switch to that ssp (depending on continous or
discontinous shstk, which the unwinder can detect), then 1) and 2)
are fixed.

but currently the distributed unwinder binary is incompatible with
1) and 2) so sigaltshstk cannot be added later. breaking the unwind
abi is not acceptable.

Can you point me to what you are talking about? I tested adding fields
to the shadow stack on top of these changes. It worked with HJ's new
(unposted I think) C++ changes for gcc. Adding fields doesn't work with
the existing gcc because it assumes a fixed size.

quoted

The problem is that gcc expects a fixed 8 byte sized shadow stack
signal frame. The format in these patches is such that it can be
expanded for the sake of supporting alt shadow stack later, but it
happens to be a fixed 8 bytes for now, so it will work seamlessly
with
these old gcc's. HJ has some patches to fix GCC to jump over a
dynamically sized shadow stack signal frame, but this of course
won't
stop old gcc's from generating binaries that won't work with an
expanded frame.

I was waffling on whether it would be better to pad the shadow
stack
[1] signal frame to start, this would break compatibility with any
binaries that use this -fnon-call-exceptions feature (if there are
any), but would set us up better for the future if we got away with
it.

i don't see how -fnon-call-exceptions is relevant.

It uses unwinder code that does assume a fixed shadow stack signal
frame size. Since gcc 8.5 I think. So these compilers will continue to
generate code that assumes a fixed frame size. This is one of the
limitations we have for not moving to a new elf bit.

you can unwind from a signal handler (this is not a c++ question
but unwind abi question) and in practice eh works e.g. if the
signal is raised (sync or async) in a frame where there are no
cleanup handlers registered. in practice code rarely relies on
this (because it's not valid in c++). the main user of this i
know of is the glibc cancellation implmentation. (that is special
in that it never catches the exception so ssp does not have to be
updated for things to work, but in principle the unwinder should
still verify the entries on shstk, otherwise the security
guarantees are broken and the cleanup handlers can be hijacked.
there are glibc abi issues that prevent fixing this, but in other
libcs this may be still relevant).

I'm not fully sure what you are trying to say here. The glibc shadow
stack stuff that is there today supports unwinding through a signal
handler. The longjmp code (unlike fnon-call-exections) doesn't look at
the shstk signal frame. It just does INCSSP until it reaches its
desired SSP, not caring what it is INCSSPing over.

quoted

On one hand we are already juggling some compatibility issues so
maybe
it's not too much worse, but on the other hand the kernel is trying
its
best to be as compatible as it can given the situation. It doesn't
*need* to break this compatibility at this point.

In the end I thought it was better to deal with it later.

quoted

 (may affect longjmp too depending on
how it is implemented)

glibc's longjmp ignores anything everything it skips over and just
does
INCSSP until it gets back to the setjmp point. So it is not
affected by
the shadow stack signal frame format. I don't think we can support
longjmping off an alt shadow stack unless we enable WRSS or get the
kernel's help. So this was to be declared as unsupported.

longjmp can support discontinous shadow stack without wrss.
the current code proposed to glibc does not, which is wrong
(it breaks altshstk and green thread users like qemu for no
good reason).

declaring things unsupported means you have to go around to
audit and mark binaries accordingly.

The idea that all apps can be supported without auditing has been
assumed to be impossible by everyone I've talked to, including the
GLIBC developers deeply versed in the architectural limitations of this
feature. So if you have a magic solution, then that is a notable claim
and I think you should propose it instead of just alluding to the fact
that there is one.

The only non-WRSS "longjmp from an alt shadow stack solution" that I
can think of would have something like a new syscall performing some
limited shadow stack actions normally prohibited in userspace by the
architecture. We'd have to think through how this would impact the
security. There are a lot of security/compatibility tradeoffs to parse
in this. So also, just because something can be done, doesn't mean we
should do it. I think the philosophy at this point is, lets get the
basics working that can support most apps, and learn more about stuff
like where this bar is in the real world.

quoted

we can change the unwinder now to know how to switch shstk when
it unwinds the signal frame and backport that to systems that
want to support shstk. or we can introduce a new elf marking
scheme just for sigaltshstk when it is added so incompatibility
can be detected. or we simply not support unwinding with
sigaltshstk which would make it pretty much useless in practice.

Yea, I was thinking along the same lines. Someday we could easily
need
some new marker. Maybe because we want to add something, or maybe
because of the pre-existing userspace. In that case, this
implementation will get the ball rolling and we can learn more
about
how shadow stack will be used. So if we need to break compatibility
with any apps, we would not really be in a different situation than
we
are already in (if we are going to take proper care to not break
userspace). So if/when that happens all the learning's can go into
the
clean break.

But if it's not clear, unwinder's that properly use the format in
these
patches should work from an alt shadow stack implemented like that
RFC
linked earlier in the thread. At least it will be able to read back
the
shadow stack starting from the alt shadow stack, it can't actually
resume control flow from where it unwound to. For that we need WRSS
or
some kernel help.

wrss is not needed to resume control flow on a different shstk.

WRSS lets you resume control flow at arbitrarily points by writing your
own restore token. Otherwise there are restrictions.

(if you needed wrss then the map_shadow_stack would be useless.)

map_shadow_stack is usually prepopulated with a token, otherwise it
does need WRSS to create one on it.

quoted

[0]
https://lore.kernel.org/lkml/7d8133c7e0186bdaeb3893c1c808148dc0d11945.camel@intel.com/ (local)

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help