Thread: 148 messages, 17 authors, 2022-06-09

Re: [PATCH 00/35] Shadow stacks for userspace

From: Thomas Gleixner <hidden>
Date: 2022-02-03 21:08:01
Also in: linux-arch, linux-doc, linux-mm, lkml

Rick,

On Sun, Jan 30 2022 at 13:18, Rick Edgecombe wrote:
This is a slight reboot of the userspace CET series. I will be taking over the 
series from Yu-cheng. Per some internal recommendations, I’ve reset the version
number and am calling it a new series. Hopefully, it doesn’t cause
confusion.
That's fine as it seems to be a major change in course, so a reset to V1
is justified. Don't worry about confusion, we can easily confuse ourselves
with more minor things than that version reset :)
The new plan is to upstream only userspace Shadow Stack support at this point. 
IBT can follow later, but for now I’ll focus solely on the most in-demand and
widely available (with the feature on AMD CPUs now) part of CET.
We just have to keep IBT in mind so that we don't add roadblocks which
we regret some time later.
I thought as part of this reset, it might be useful to more fully write-up the 
design and summarize the history of the previous CET series. So this slightly
long cover letter does that. The "Updates" section has the changes, if anyone
doesn't want the history.
Thanks for that lengthy writeup. It's appreciated. There is too much
confusion already so a coherent summary is helpful.
Why is Shadow Stack Wanted
==========================
The main use case for userspace shadow stack is providing protection against 
return oriented programming attacks. Fedora and Ubuntu already have many/most 
packages enabled for shadow stack.
Which is unfortunately part of the overall problem ...
History
=======
The branding “CET” really consists of two features: “Shadow Stack” and 
“Indirect Branch Tracking”. They both restrict previously allowed, but rarely 
valid behaviors and require userspace to change to avoid these behaviors before 
enabling the protection. These raw HW features need to be assembled into a 
software solution across userspace and kernel in order to add security value.
The kernel part of this solution has evolved iteratively starting with a lengthy
RFC period. 

Until now, the enabling effort was trying to support both Shadow Stack and IBT. 
This history will focus on a few areas of the shadow stack development history 
that I thought stood out.

	Signals
	-------
	Originally signals placed the location of the shadow stack restore 
	token inside the saved state on the stack. This was problematic from a 
	past ABI promises perspective. So the restore location was instead just 
	assumed from the shadow stack pointer. This works because in normal 
	allowed cases of calling sigreturn, the shadow stack pointer should be 
	right at the restore token at that time. There is no alternate shadow 
	stack support. If an alt shadow stack is added later we would
	need to
So how is that going to work? altstack is not an esoteric corner case.
	Enabling Interface
	------------------
	For the entire history of the original CET series, the design was to 
	enable shadow stack automatically if the feature bit was detected in 
	the elf header. Then it was userspace’s responsibility to turn it off 
	via an arch_prctl() if it was not desired, and this was handled by the 
	glibc dynamic loader. Glibc’s standard behavior (when CET is configured) 
	is to leave shadow stack enabled if the executable and all linked 
	libraries are marked with shadow stacks.

	Many distros (Fedora and others) have binaries already marked with 
	shadow stack, waiting for kernel support. Unfortunately their glibc 
	binaries expect the original arch_prctl() interface for allocating 
	shadow stacks, as those changes were pushed ahead of kernel support. 
	The net result of it all is, when updating to a kernel with shadow 
	stack these binaries would suddenly get shadow stack enabled and expect 
	the arch_prctl() interface to be there. And so calls to makecontext() 
	will fail, resulting in visible breakages. This series deals with this 
	problem as described below in "Updates".
I'm really impressed by the well thought out coordination on the glibc and
distro side. Design by committee never worked ...
Updates
=======
These updates were mostly driven by public comments, but a lot of the design 
elements are new. I would like some extra scrutiny on the updates.

	New syscall for Shadow Stack Allocation
	---------------------------------------
	A new syscall is added for allocating shadow stacks to replace 
	PROT_SHADOW_STACK. Several options were considered, as described in the 
	“x86/cet/shstk: Introduce map_shadow_stack syscall”.

	Xsave Managed Supervisor State Modifications
	--------------------------------------------
	The shadow stack feature requires the kernel to modify xsaves managed 
	state. On one of the last versions of Yu-cheng’s series Boris had 
	commented on the pattern it was using to do this not necessarily being 
	ideal. The pattern was to force a restore to the registers and always 
	do the modification there. Then Thomas did an overhaul of the fpu code, 
	part of which consisted of making raw access to the xsave buffer 
	private to the fpu code. So this series tries to expose access again, 
	and in a way that addresses Boris’ comments.

	The method is to provide functions like wrmsrl/rdmsrl, but that can 
	direct the operation to the correct location (registers or buffer), 
	while giving the proper notice to the fpu subsystem so things don’t get 
	clobbered or corrupted.
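The dispatch idea in the paragraph above can be sketched as a small user-space model. This is only an illustration, not the series' implementation: `struct fpu_model`, `model_state_read()`, `model_state_write()`, `model_save()` and `model_restore()` are hypothetical names, and two plain fields stand in for the live MSR and for its copy in the in-memory xsave buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names come from the series. */
struct fpu_model {
	bool     regs_live;  /* true: the task's state is loaded in the registers */
	uint64_t reg_value;  /* stands in for the live MSR */
	uint64_t buf_value;  /* stands in for the copy in the xsave buffer */
};

/* Read from wherever the up-to-date copy currently lives. */
static uint64_t model_state_read(const struct fpu_model *f)
{
	return f->regs_live ? f->reg_value : f->buf_value;
}

/* Write to the up-to-date copy only, so the other copy is never trusted stale. */
static void model_state_write(struct fpu_model *f, uint64_t val)
{
	if (f->regs_live)
		f->reg_value = val;
	else
		f->buf_value = val;
}

/* Model of saving the registers into the buffer, e.g. when switching away. */
static void model_save(struct fpu_model *f)
{
	if (f->regs_live) {
		f->buf_value = f->reg_value;
		f->regs_live = false;
	}
}

/* Model of loading the buffer back into the registers. */
static void model_restore(struct fpu_model *f)
{
	if (!f->regs_live) {
		f->reg_value = f->buf_value;
		f->regs_live = true;
	}
}
```

Callers in this model never need to know where the state lives, which is the property the cover letter is after. The real helpers would additionally have to coordinate with the fpu subsystem, which the model leaves out.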

	In the past a solution like this was discussed as part of the PASID 
	series, and Thomas was not in favor. In CET’s case there is more 
	logic around the CET MSR’s than in PASID's, and wrapping this logic 
	minimizes near identical open coded logic needed to do this more 
	efficiently. In addition it resolves the above described problem of 
	having no access to the xsave buffer. So it is being put forward here 
	under the supposition that CET’s usage may lead to a different 
	conclusion, not to try to ignore past direction.

	The user interrupt series has similar needs as CET, and will also use
	this internal interface if it’s found acceptable.
I'll have a look.
	Switch Enabling Interface
	-------------------------
	But there are existing downsides to automatic elf header processing 
	based enabling. The elf header feature spec is not defined by the 
	kernel and there are proposals to expand it to describe additional 
	logic. A simpler interface where the kernel is simply told what to 
	enable, and leaves all the decision making to userspace, is more 
	flexible for userspace and simpler for the kernel. There also already 
	needs to be an ARCH_X86_FEATURE_ENABLE arch_prctl() for WRSS (and 
	likely LAM will use it too), so it avoids there being two ways to turn 
	on these types of features. The only tricky part for shadow stack, is 
	that it has to be enabled very early. Wherever the shadow stack is 
	enabled, the app cannot return from that point, otherwise there will be 
	a shadow stack violation. It turns out glibc can enable shadow stack 
	this early, so it works nicely. So not automatically enabling any 
	features in the elf header will cleanly disable all old binaries, which 
	expect the kernel to enable CET features automatically. Then after the 
	kernel changes are upstream, glibc can be updated to use the new
	interface. This is the solution implemented in this series.
Makes sense.
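
The "cannot return from that point" constraint in the Switch Enabling Interface section can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the hardware or kernel behavior in detail: `struct toy_cpu`, `toy_call()` and `toy_ret()` are hypothetical names, and an empty array stands in for a freshly enabled shadow stack.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DEPTH 16

/* Toy model only: not the hardware or kernel data layout. */
struct toy_cpu {
	bool shstk_enabled;
	unsigned long stack[TOY_DEPTH];  /* ordinary stack: return addresses */
	size_t sp;
	unsigned long shstk[TOY_DEPTH];  /* shadow copy, written only while enabled */
	size_t ssp;
};

/* CALL: push the return address on the stack and, if enabled, on the shadow stack. */
static bool toy_call(struct toy_cpu *c, unsigned long ret_addr)
{
	if (c->sp == TOY_DEPTH || c->ssp == TOY_DEPTH)
		return false;
	c->stack[c->sp++] = ret_addr;
	if (c->shstk_enabled)
		c->shstk[c->ssp++] = ret_addr;
	return true;
}

/* RET: returns false where the model would raise a shadow stack violation. */
static bool toy_ret(struct toy_cpu *c)
{
	unsigned long addr;

	if (c->sp == 0)
		return false;
	addr = c->stack[--c->sp];
	if (!c->shstk_enabled)
		return true;
	if (c->ssp == 0)
		return false;  /* this frame was entered before enabling */
	return c->shstk[--c->ssp] == addr;
}
```

In the model, a frame entered before `shstk_enabled` is set has no shadow copy of its return address, so returning through it fails, while calls made after enabling return normally. That is why the enabling call has to sit at a point the program never returns from.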
	Expand Commit Logs
	------------------
	As part of spinning up on this series, I found some of the commit logs 
	did not describe the changes in enough detail for me to understand their 
	purpose. I tried to expand the logs and comments, where I had to go 
	digging. Hopefully it’s useful.
Proper changelogs are always appreciated.
	
	Limit to only Intel Processors
	------------------------------
	Shadow stack is supported on some AMD processors, but this revision 
	(with expanded HW usage and xsaves changes) has only been tested on 
	Intel ones. So this series has a patch to limit shadow stack support to 
	Intel processors. Ideally the patch would not even make it to mainline, 
	and should be dropped as soon as this testing is done. It's included 
	just in case.
Ha. I can give you access to an AMD machine with CET SS supported :)
Future Work
===========
Even though this is now exclusively a shadow stack series, there is still some 
remaining shadow stack work to be done.

	Ptrace
	------
	Early in the series, there was a patch to allow IA32_U_CET and
	IA32_PL3_SSP to be set. This patch was dropped and planned as a follow
	up to basic support, and it remains the plan. It will be needed for
	in-progress gdb support.
It's pretty much a prerequisite for enabling it, right?
	CRIU Support
	------------
	In the past there was some speculation on the mailing list about 
	whether CRIU would need to be taught about CET. It turns out, it does. 
	The first issue hit is that CRIU calls sigreturn directly from its 
	“parasite code” that it injects into the dumper process. This violates
	this shadow stack implementation’s protection that intends to prevent
	attackers from doing this.

	With so many packages already enabled with shadow stack, there is 
	probably desire to make it work seamlessly. But in the meantime if 
	distros want to support shadow stack and CRIU, users could manually 
	disable shadow stack via “GLIBC_TUNABLES=glibc.cpu.x86_shstk=off” for 
	a process they want to dump. It’s not ideal.
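
As a rough sketch of that workaround from a launcher's point of view: glibc reads tunables at program startup, so the variable has to be in the environment before the target is exec'ed. The helper names `disable_shstk_for_children()` and `exec_without_shstk()` are hypothetical; only the tunable string is taken from the text above.

```c
#include <stdlib.h>
#include <unistd.h>

/* Tunable string as quoted in the cover letter. */
#define SHSTK_OFF_TUNABLE "glibc.cpu.x86_shstk=off"

/*
 * Set the tunable for every program exec'ed from this process afterwards.
 * Note: this overwrites any GLIBC_TUNABLES value that was already set.
 */
static int disable_shstk_for_children(void)
{
	return setenv("GLIBC_TUNABLES", SHSTK_OFF_TUNABLE, 1);
}

/* Replace the current process with argv[0], started with the tunable set. */
static int exec_without_shstk(char *const argv[])
{
	if (disable_shstk_for_children() != 0)
		return -1;
	execvp(argv[0], argv);
	return -1;  /* only reached if execvp() failed */
}
```

Setting the variable inside an already running process would not change that process, which is part of why the cover letter calls the workaround not ideal.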

	I’d like to hear what people think about having shadow stack in the 
	kernel without this resolved. Nothing would change for any users until 
	they enable shadow stack in the kernel and update to a glibc configured
	with CET. Should CRIU userspace be solved before kernel support?
Definitely yes. Making CRIU users add a glibc tunable is not really an
option. We can't break CRIU systems with a kernel upgrade.

Thanks,

        tglx
