Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support
From: Yu-cheng Yu <hidden>
Date: 2018-07-13 18:06:48
Also in:
linux-api, linux-arch, linux-mm, lkml
On Wed, 2018-07-11 at 15:21 -0700, Andy Lutomirski wrote:
On Jul 11, 2018, at 2:51 PM, Jann Horn [off-list ref] wrote:
On Wed, Jul 11, 2018 at 2:34 PM Andy Lutomirski [off-list ref] wrote:
On Jul 11, 2018, at 2:10 PM, Jann Horn [off-list ref] wrote:
On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu [off-list ref] wrote:

This patch adds basic shadow stack enabling/disabling routines.
A task's shadow stack is allocated from memory with VM_SHSTK
flag set and read-only protection. The shadow stack is
allocated to a fixed size.

Signed-off-by: Yu-cheng Yu <redacted>
[...]
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
new file mode 100644
index 000000000000..96bf69db7da7
--- /dev/null
+++ b/arch/x86/kernel/cet.c
[...]
+static unsigned long shstk_mmap(unsigned long addr, unsigned long len)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long populate;
+
+	down_write(&mm->mmap_sem);
+	addr = do_mmap(NULL, addr, len, PROT_READ,
+		       MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK,
+		       0, &populate, NULL);
+	up_write(&mm->mmap_sem);
+
+	if (populate)
+		mm_populate(addr, populate);
+
+	return addr;
+}
[...]
Should the kernel enforce that two shadow stacks must have a guard page between them so that they can not be directly adjacent, so that if you have too much recursion, you can't end up corrupting an adjacent shadow stack?

I think the answer is a qualified “no”. I would like to instead enforce a general guard page on all mmaps that don’t use MAP_FORCE. We *might* need to exempt any mmap with an address hint for compatibility.

I like this idea a lot.
My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing.

Hmm. Linux should maybe add something like Windows’ “reserved” virtual memory. It’s basically a way to ask for a VA range that explicitly contains nothing and can subsequently be turned into something useful with the equivalent of MAP_FORCE.

What's the benefit over creating an anonymous PROT_NONE region? That the kernel won't have to scan through the corresponding PTEs when tearing down the mapping?

Make it more obvious what’s happening and avoid accounting issues? What I’ve actually used is MAP_NORESERVE | PROT_NONE, but I think this still counts against the VA rlimit. But maybe that’s actually the desired behavior.
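
For readers following along, the MAP_NORESERVE | PROT_NONE approach mentioned above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not code from the patch or from anyone's allocator: the helper names alloc_with_guards() and free_with_guards() are hypothetical, and MAP_FIXED is used here as a stand-in for the "MAP_FORCE" behaviour described in the thread. The sketch reserves an inaccessible range and then maps usable memory over its middle, which leaves one PROT_NONE guard page on each side.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round n up to a multiple of the page size. */
static size_t round_up_to_page(size_t n, size_t page)
{
	return (n + page - 1) / page * page;
}

/*
 * Hypothetical helper: return a read/write region of at least `usable`
 * bytes with one inaccessible guard page before and after it, or NULL
 * on failure.
 */
static void *alloc_with_guards(size_t usable)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = round_up_to_page(usable, page);
	size_t total = len + 2 * page;
	char *base;
	void *mid;

	/* Step 1: reserve the whole range, inaccessible and uncommitted. */
	base = mmap(NULL, total, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	/*
	 * Step 2: replace the middle of the reservation with usable memory.
	 * MAP_FIXED is safe here because the target range lies entirely
	 * inside the mapping created in step 1.
	 */
	mid = mmap(base + page, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (mid == MAP_FAILED) {
		munmap(base, total);
		return NULL;
	}
	return mid;
}

/* Hypothetical helper: release a region and both of its guard pages. */
static int free_with_guards(void *p, size_t usable)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = round_up_to_page(usable, page);

	return munmap((char *)p - page, len + 2 * page);
}
```

With this layout, a linear overrun past either end of the usable region faults on the guard page instead of silently writing into whatever mapping happens to be adjacent. As noted in the thread, the guard pages still consume virtual address space.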
We can put a NULL at both ends of a SHSTK to guard against corruption.

Yu-cheng