Thread (34 messages) 34 messages, 5 authors, 2024-08-16

Re: [PATCH RFT v8 4/9] fork: Add shadow stack support to clone3()

From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2024-08-16 15:38:55
Also in: linux-kselftest, lkml

On Fri, Aug 16, 2024 at 02:52:28PM +0000, Edgecombe, Rick P wrote:
On Fri, 2024-08-16 at 09:44 +0100, Catalin Marinas wrote:
quoted
quoted
After a token is consumed normally, it doesn't set it to zero. Instead it
sets it to a "previous-ssp token". I don't think we actually want to do that here
though because it involves the old SSP, which doesn't really apply in this
case. I don't see any problem with zero, but was there any special thinking behind
it?
BTW, since it's the parent setting up the shadow stack in its own
address space before forking, I think at least the read can avoid
access_remote_vm() and we could do it earlier, even before the new
process is created.
Hmm. Makes sense. It's a bit racy since the parent could consume that token from
another thread, but it would be a race in any case.
More on the race below. If we handle it properly, we don't need the
separate checks.
quoted
quoted
quoted
+       if (access_remote_vm(mm, addr, &val, sizeof(val),
+                            FOLL_FORCE | FOLL_WRITE) != sizeof(val))
+               goto out;
The GUPs still seem a bit unfortunate for a couple reasons:
  - We could do a CMPXCHG version and are just not (I see ARM has identical
code in gcs_consume_token()). It's not the only race like this though FWIW.
  - I *think* this is the only unprivileged FOLL_FORCE that can write to the
current process in the kernel. As is, it could be used on normal RO
mappings, at
least in a limited way. Maybe another point for the VMA check. We'd want to
check that it is normal shadow stack?
  - Lingering doubts about the wisdom of doing GUPs during task creation.
I don't like the access_remote_vm() either. In the common (practically
only) case with CLONE_VM, the mm is actually current->mm, so no need for
a GUP.
On the x86 side, we don't have a shadow stack access CMPXCHG. We will have to
GUP and do a normal CMPXCHG off of the direct map to handle it fully properly in
any case (CLONE_VM or not).
I guess we could do the same here and for the arm64 gcs_consume_token().
Basically get_user_page_vma_remote() gives us the page together with the
vma that you mentioned needs checking. We can then do a cmpxchg directly
on the page_address(). It's probably faster anyway than doing GUP twice.

-- 
Catalin
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help