Re: [PATCH v8 9/9] seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
From: Andy Lutomirski <luto@amacapital.net>
Date: 2014-06-25 18:10:06
Also in:
linux-arch, linux-arm-kernel, linux-mips, lkml
On Wed, Jun 25, 2014 at 10:57 AM, Kees Cook [off-list ref] wrote:
On Wed, Jun 25, 2014 at 10:24 AM, Oleg Nesterov [off-list ref] wrote:quoted
On 06/25, Kees Cook wrote:quoted
On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov [off-list ref] wrote:quoted
Yes, at least this should close the race with suid-exec. And there are no other users. Except apparmor, and I hope you will check it because I simply do not know what it does ;)quoted
I wonder if changes to nnp need to "flushed" during syscall entry instead of getting updated externally/asynchronously? That way it won't be out of sync with the seccomp mode/filters. Perhaps secure computing needs to check some (maybe seccomp-only) atomic flags and flip on the "real" nnp if found?Not sure I understand you, could you clarify?Instead of having TSYNC change the nnp bit, it can set a new flag, say: task->seccomp.flags |= SECCOMP_NEEDS_NNP; This would be set along with seccomp.mode, seccomp.filter, and TIF_SECCOMP. Then, during the next secure_computing() call that thread makes, it would check the flag: if (task->seccomp.flags & SECCOMP_NEEDS_NNP) task->nnp = 1; This means that nnp couldn't change in the middle of a running syscall.Aha, so you were worried about the same thing. Not sure we need this, but at least I understand you and...quoted
Hmmm. Perhaps this doesn't solve anything, though? Perhaps my proposal above would actually make things worse, since now we'd have a thread with seccomp set up, and no nnp. If it was in the middle of exec, we're still causing a problem.Yes ;)quoted
I think we'd also need a way to either delay the seccomp changes, or to notice this condition during exec. Bleh.Hmm. confused again,I mean to suggest that the tsync changes would be stored in each thread, but somewhere other than the true seccomp struct, but with TIF_SECCOMP set. When entering secure_computing(), current would check for the "changes to sync", and apply them, then start the syscall. In this way, we can never race a syscall (like exec).
I'm not sure that helps. If you set a pending filter part-way through exec, and exec copies that pending filter but doesn't notice NNP, then there's an exploitable race.
quoted
quoted
What actually happens with a multi-threaded process calls exec? I assume all the other threads are destroyed?Yes. But this is the point-of-no-return, de_thread() is called after the execing thared has already passed (say) check_unsafe_exec(). However, do_execve() takes cred_guard_mutex at the start in prepare_bprm_creds() and drops it in install_exec_creds(), so it should solve the problem?I can't tell yet. I'm still trying to understand the order of operations here. It looks like de_thread() takes the sighand lock. do_execve_common does: prepare_bprm_creds (takes cred_guard_mutex) check_unsafe_exec (checks nnp to set LSM_UNSAFE_NO_NEW_PRIVS) prepare_binprm (handles suid escalation, checks nnp separately) security_bprm_set_creds (checks LSM_UNSAFE_NO_NEW_PRIVS) exec_binprm load_elf_binary flush_old_exec de_thread (takes and releases sighand->lock) install_exec_creds (releases cred_guard_mutex) I don't see a way to use cred_guard_mutex during tsync (which holds sighand->lock) without dead-locking. What were you considering here?
Grab cred_guard_mutex and then sighand->lock, perhaps?
-Kees -- Kees Cook Chrome OS Security
-- Andy Lutomirski AMA Capital Management, LLC