Re: [PATCH v2] mm,oom: Allow SysRq-f to always select !TIF_MEMDIE thread group.
From: Michal Hocko <hidden>
Date: 2016-06-01 07:34:45
On Wed 01-06-16 06:35:30, Tetsuo Handa wrote:
Michal Hocko wrote:quoted
On Sun 29-05-16 01:25:14, Tetsuo Handa wrote:quoted
There has been three problems about SysRq-f (manual invocation of the OOM killer) case. To make description simple, this patch assumes situation where the OOM reaper is not called (because the OOM victim's mm is shared by unkillable threads) or not available (due to kthread_run() failure or CONFIG_MMU=n). First is that moom_callback() is not called by moom_work under OOM livelock situation because it does not have a dedicated WQ like vmstat_wq. This problem is not fixed yet.Why do you mention it in the changelog when it is not related to the patch then?Just we won't forget about it.
OK, then this belongs to a cover letter. Discussing unrelated things in the patch description might end up being just confusing.
quoted
Btw. you can (ab)use oom_reaper for that purpose. The patch would be quite trivial.How do you handle CONFIG_MMU=n case?
void schedule_sysrq_oom(void)
{
if (IS_ENABLED(CONFIG_MMU) && oom_reaper_th)
kick_oom_reaper()
else
schedule_work(&moom_work);
}
[...]quoted
quoted
But SysRq-f case will select such thread group due to returning OOM_SCAN_OK. This patch makes sure that oom_badness() is skipped by making oom_scan_process_group() to return OOM_SCAN_CONTINUE for SysRq-f case.I am OK with this part. I was suggesting something similar except I wanted to skip over tasks which have fatal_signal_pending and that part got nacked by David AFAIR. Could you make this a separate patch, please?I think it is better to change both part with this patch.
They are semantically different (one is sysrq specific while the other is not) so I would prefer to split them up.
quoted
quoted
Third is that oom_kill_process() chooses a thread group which already has a TIF_MEMDIE thread when the candidate select_bad_process() chose has children because oom_badness() does not take TIF_MEMDIE into account. This patch checks child->signal->oom_victims before calling oom_badness() if oom_kill_process() was called by SysRq-f case. This resembles making sure that oom_badness() is skipped by returning OOM_SCAN_CONTINUE.This makes sense to me as well but why should be limit this to sysrq case? Does it make any sense to select a child which already got killed for normal OOM killer? Anyway I think it would be better to split this into its own patch as well.The reason is described in next paragraph. Do we prefer immediately killing all children of the allocating task?
I do not think we want to select them _all_. We haven't been doing that and I do not see a reason we should start now. But it surely doesn't make any sense to select a task which has already TIF_MEMDIE set. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>