Thread (28 messages) 28 messages, 4 authors, 2017-08-23

Re: [v5 2/4] mm, oom: cgroup-aware OOM killer

From: Roman Gushchin <hidden>
Date: 2017-08-23 16:21:06
Also in: linux-mm, lkml

Hi Johannes!

Thank you for review!

I do agree with most of the comments, and I will address them in v6.
I'll post it soon.

Please, find some comments below.

On Tue, Aug 22, 2017 at 01:03:44PM -0400, Johannes Weiner wrote:
Hi Roman,

great work! This looks mostly good to me now. Below are some nitpicks
concerning naming and code layout, but nothing major.
quoted
+
+	css_task_iter_start(&memcg->css, 0, &it);
+	while ((task = css_task_iter_next(&it))) {
+		/*
+		 * If there are no tasks, or all tasks have oom_score_adj set
+		 * to OOM_SCORE_ADJ_MIN and oom_kill_all_tasks is not set,
+		 * don't select this memory cgroup.
+		 */
+		if (!elegible &&
+		    (memcg->oom_kill_all_tasks ||
+		     task->signal->oom_score_adj != OOM_SCORE_ADJ_MIN))
+			elegible = 1;
This is a little awkward to read. How about something like this:

	/*
	 * When killing individual tasks, we respect OOM score adjustments:
	 * at least one task in the group needs to be killable for the group
	 * to be oomable.
	 *
	 * Also check that previous OOM kills have finished, and abort if
	 * there are any pending OOM victims.
	 */
	oomable = memcg->oom_kill_all_tasks;
	while ((task = css_task_iter_next(&it))) {
		if (!oomable && task->signal_oom_score_adj != OOM_SCORE_ADJ_MIN)
			oomable = 1;
quoted
+		if (tsk_is_oom_victim(task) &&
+		    !test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags)) {
+			elegible = -1;
+			break;
+		}
+	}
+	css_task_iter_end(&it);
We ignore oom_score_adj if oom_kill_all_tasks is set, it's
not reflected in your version. Anyway, I've moved the comments block
outside and rephrased it to make more clear.
etc.
quoted
+
+	return elegible > 0 ? memcg_oom_badness(memcg, nodemask) : elegible;
I find these much easier to read if broken up, even if it's more LOC:

	if (eligible <= 0)
		return eligible;

	return memcg_oom_badness(memcg, nodemask);
quoted
+static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
+{
+	struct mem_cgroup *iter, *parent;
+
+	for_each_mem_cgroup_tree(iter, root) {
+		if (memcg_has_children(iter)) {
+			iter->oom_score = 0;
+			continue;
+		}
+
+		iter->oom_score = oom_evaluate_memcg(iter, oc->nodemask);
+		if (iter->oom_score == -1) {
Please add comments to document the special returns. Maybe #defines
would be clearer, too.
quoted
+			oc->chosen_memcg = (void *)-1UL;
+			mem_cgroup_iter_break(root, iter);
+			return;
+		}
+
+		if (!iter->oom_score)
+			continue;
Same here.

Maybe a switch would be suitable to handle the abort/no-score cases. 
Not sure about switch/defines, but I've added several comment blocks
to describe possible return values, as well as their handling.
Hope, it will be enough.
quoted
 static int memory_events_show(struct seq_file *m, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
@@ -5310,6 +5512,12 @@ static struct cftype memory_files[] = {
 		.write = memory_max_write,
 	},
 	{
+		.name = "oom_kill_all_tasks",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = memory_oom_kill_all_tasks_show,
+		.write = memory_oom_kill_all_tasks_write,
+	},
This name is quite a mouthful and reminiscent of the awkward v1
interface names. It doesn't really go well with the v2 names.

How about memory.oom_group?
I'd prefer to have something more obvious. I've renamed
memory.oom_kill_all_tasks to memory.oom_kill_all, which was earlier suggested
by Vladimir. Are you ok with it?

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help