Re: [PATCH] coredump: fix unfreezable coredumping task
From: Michal Hocko <mhocko@kernel.org>
Date: 2016-10-05 09:17:46
Also in:
linux-fsdevel, linux-pm, lkml
On Tue 04-10-16 18:13:05, Oleg Nesterov wrote:
> On 10/04, Michal Hocko wrote:
> > On Fri 30-09-16 14:47:41, Oleg Nesterov wrote:
> > > On 09/30, Andrey Ryabinin wrote:
> > > > @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> > > >  	if (core_waiters > 0) {
> > > >  		struct core_thread *ptr;
> > > >  
> > > > +		freezer_do_not_count();
> > > >  		wait_for_completion(&core_state->startup);
> > > > +		freezer_count();
> > >
> > > Agreed... we could probably even do
> > >
> > > --- x/fs/coredump.c
> > > +++ x/fs/coredump.c
> > > @@ -423,7 +423,13 @@ static int coredump_wait(int exit_code,
> > >  	if (core_waiters > 0) {
> > >  		struct core_thread *ptr;
> > >  
> > > -		wait_for_completion(&core_state->startup);
> > > +		if (wait_for_completion_interruptible(&core_state->startup)) {
> > > +			/* see the comment in dump_interrupted() */
> > > +			down_write(&mm->mmap_sem);
> > > +			coredump_finish(mm, false);
> > > +			up_write(&mm->mmap_sem);
> > > +			return -EINTR;
> > > +		}
> > >  		/*
> > >  		 * Wait for all the threads to become inactive, so that
> > >  		 * all the thread context (extended register state, like
> >
> > This looks like a very good idea to me. We really want to make the whole
> > coredump_wait killable.
>
> Well, it is already killable.

Except wait_for_completion() is not killable, and the exiting tasks might be blocked in a !killable state, preventing this one from continuing. But...

> And with the change above it can sleep in down_write(mmap_sem) and we
> really need this lock to abort, so it won't necessarily react to SIGKILL
> faster.

you are right that somebody might be holding mmap_sem and we cannot get rid of it here.

> > I guess this should help us to remove the hackish
> > sig->flags & SIGNAL_GROUP_COREDUMP check from __task_will_free_mem.
>
> Why? This doesn't depend on "killable". __task_will_free_mem() checks this
> flag to detect the CLONE_VM processes which won't exit soon because they
> participate in the coredumping.

I just (wrongly) assumed that if we make this path completely killable we can guarantee forward progress and get rid of the SIGNAL_GROUP_COREDUMP check completely. But you are right, this won't be sufficient.
-- 
Michal Hocko
SUSE Labs
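
For illustration only, here is a minimal userspace sketch of the kind of flag check discussed above. It is not the kernel's code: the struct types, the MODEL_* constants and their values, and the function name are all hypothetical stand-ins. Only the early bail-out on the coredump flag is taken from this thread; the remaining conditions are assumptions added so the sketch has something to fall through to.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model; not copied from kernel headers. */
#define MODEL_SIGNAL_GROUP_EXIT     0x4u
#define MODEL_SIGNAL_GROUP_COREDUMP 0x8u
#define MODEL_PF_EXITING            0x4u

/* Hypothetical stand-ins for signal_struct and task_struct. */
struct model_signal {
	unsigned int flags;
};

struct model_task {
	struct model_signal *signal;
	unsigned int flags;
	bool thread_group_empty;
};

/*
 * Returns true if the task is expected to release its memory soon.
 * A task whose thread group takes part in a coredump is reported as
 * "won't free memory soon", no matter which other flags are set.
 */
static bool model_task_will_free_mem(const struct model_task *task)
{
	const struct model_signal *sig = task->signal;

	/* The check from the thread: coredump participants may sleep for long. */
	if (sig->flags & MODEL_SIGNAL_GROUP_COREDUMP)
		return false;

	/* Assumed for the model: a group-wide exit is already in progress. */
	if (sig->flags & MODEL_SIGNAL_GROUP_EXIT)
		return true;

	/* Assumed for the model: a lone thread that is already exiting. */
	if (task->thread_group_empty && (task->flags & MODEL_PF_EXITING))
		return true;

	return false;
}
```

In this model the coredump test comes first, so a task with both the exit and coredump bits set is still treated as one that will not exit soon. That ordering does not depend on whether any wait in the dump path is killable, which is the point Oleg makes above.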