Thread (22 messages) 22 messages, 4 authors, 2020-03-21

Re: [PATCH v2 5/5] exec: Add a exec_update_mutex to replace cred_guard_mutex

From: Eric W. Biederman <hidden>
Date: 2020-03-12 14:40:38
Also in: linux-api, linux-doc, linux-fsdevel, linux-mm, lkml

Possibly related (same subject, not in this thread)

Kirill Tkhai [off-list ref] writes:
On 12.03.2020 15:24, Eric W. Biederman wrote:
quoted
Kirill Tkhai [off-list ref] writes:
quoted
On 09.03.2020 00:38, Eric W. Biederman wrote:
quoted
The cred_guard_mutex is problematic.  The cred_guard_mutex is held
over the userspace accesses as the arguments from userspace are read.
The cred_guard_mutex is held of PTRACE_EVENT_EXIT as the the other
threads are killed.  The cred_guard_mutex is held over
"put_user(0, tsk->clear_child_tid)" in exit_mm().

Any of those can result in deadlock, as the cred_guard_mutex is held
over a possible indefinite userspace waits for userspace.

Add exec_update_mutex that is only held over exec updating process
with the new contents of exec, so that code that needs not to be
confused by exec changing the mm and the cred in ways that can not
happen during ordinary execution of a process.

The plan is to switch the users of cred_guard_mutex to
exec_udpate_mutex one by one.  This lets us move forward while still
being careful and not introducing any regressions.

Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/ (local)
Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/ (local)
Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/ (local)
Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/ (local)
Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/ (local)
Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.")
Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2")
Signed-off-by: "Eric W. Biederman" <redacted>
---
 fs/exec.c                    | 9 +++++++++
 include/linux/sched/signal.h | 9 ++++++++-
 init/init_task.c             | 1 +
 kernel/fork.c                | 1 +
 4 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index d820a7272a76..ffeebb1f167b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1014,6 +1014,7 @@ static int exec_mmap(struct mm_struct *mm)
 {
 	struct task_struct *tsk;
 	struct mm_struct *old_mm, *active_mm;
+	int ret;
 
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
@@ -1034,6 +1035,11 @@ static int exec_mmap(struct mm_struct *mm)
 			return -EINTR;
 		}
 	}
+
+	ret = mutex_lock_killable(&tsk->signal->exec_update_mutex);
+	if (ret)
+		return ret;
You missed old_mm->mmap_sem unlock. See here:
Duh.  Thank you.

I actually need to switch the lock ordering here, and I haven't yet
because my son was sick yesterday.
There is some fundamental problem with your patch, since the below fires in 100% cases
on current linux-next:
Thank you.

I have just backed this out of linux-next for now because it is clearly
flawed.

You make some good points about the recursion.  I will go back to the
drawing board and see what I can work out.

quoted hunk
[   22.838717] kernel BUG at fs/exec.c:1474!
diff --git a/fs/exec.c b/fs/exec.c
index 47582cd97f86..0f77f8c94905 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1470,8 +1470,10 @@ static void free_bprm(struct linux_binprm *bprm)
 {
 	free_arg_pages(bprm);
 	if (bprm->cred) {
-		if (!bprm->mm)
+		if (!bprm->mm) {
+			BUG_ON(!mutex_is_locked(&current->signal->exec_update_mutex));
 			mutex_unlock(&current->signal->exec_update_mutex);
+		}
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
@@ -1521,6 +1523,7 @@ void install_exec_creds(struct linux_binprm *bprm)
 	 * credentials; any time after this it may be unlocked.
 	 */
 	security_bprm_committed_creds(bprm);
+	BUG_ON(!mutex_is_locked(&current->signal->exec_update_mutex));
 	mutex_unlock(&current->signal->exec_update_mutex);
 	mutex_unlock(&current->signal->cred_guard_mutex);
 }

---------------------------------------------------------------------------------------------
First time the mutex is unlocked in:

exec_binprm()->search_binary_handler()->.load_binary->install_exec_creds()

Then exec_binprm()->search_binary_handler()->.load_binary->flush_old_exec() clears mm:

        bprm->mm = NULL;        

Second time the mutex is unlocked in free_bprm():

	if (bprm->cred) {
                if (!bprm->mm)
                        mutex_unlock(&current->signal->exec_update_mutex);

My opinion is we should not relay on side indicators like bprm->mm. Better you may
introduce struct linux_binprm::exec_update_mutex_is_locked. So the next person dealing
with this after you won't waste much time on diving into this. Also, if someone decides
to change the place, where bprm->mm is set into NULL, this person will bump into hell
of dependences between unrelated components like your newly introduced mutex.

So, I'm strongly for *struct linux_binprm::exec_update_mutex_is_locked*, since this improves
modularity.
Am I wrong or is that also a problem with cred_guard_mutex?

Eric
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help