Thread (8 messages) 8 messages, 3 authors, 2012-11-09

Re: [PATCH V3] memcg, oom: provide more precise dump info while memcg oom happening

From: Michal Hocko <hidden>
Date: 2012-11-09 10:50:57
Also in: linux-mm, lkml

On Fri 09-11-12 18:23:07, Sha Zhengju wrote:
On 11/09/2012 12:25 AM, Michal Hocko wrote:
quoted
On Thu 08-11-12 23:52:47, Sha Zhengju wrote:
[...]
quoted
quoted
+	for (i = 0; i<  MEM_CGROUP_STAT_NSTATS; i++) {
+		long long val = 0;
+		if (i == MEM_CGROUP_STAT_SWAP&&  !do_swap_account)
+			continue;
+		for_each_mem_cgroup_tree(mi, memcg)
+			val += mem_cgroup_read_stat(mi, i);
+		printk(KERN_CONT "%s:%lldKB ", mem_cgroup_stat_names[i], K(val));
+	}
+
+	for (i = 0; i<  NR_LRU_LISTS; i++) {
+		unsigned long long val = 0;
+
+		for_each_mem_cgroup_tree(mi, memcg)
+			val += mem_cgroup_nr_lru_pages(mi, BIT(i));
+		printk(KERN_CONT "%s:%lluKB ", mem_cgroup_lru_names[i], K(val));
+	}
+	printk(KERN_CONT "\n");
This is nice and simple I am just thinking whether it is enough. Say
that you have a deeper hierarchy and the there is a safety limit in the
its root
        A (limit)
       /|\
      B C D
          |\
  E F

and we trigger an OOM on the A's limit. Now we know that something blew
up but what it was we do not know. Wouldn't it be better to swap the for
and for_each_mem_cgroup_tree loops? Then we would see the whole
hierarchy and can potentially point at the group which doesn't behave.
Memory cgroup stats for A/: ...
Memory cgroup stats for A/B/: ...
Memory cgroup stats for A/C/: ...
Memory cgroup stats for A/D/: ...
Memory cgroup stats for A/D/E/: ...
Memory cgroup stats for A/D/F/: ...

Would it still fit in with your use case?
[...]
We haven't used those complicate hierarchy yet, but it sounds a good
suggestion. :)
Hierarchy is a little complex to use from our experience, and the
three cgroups involved in memcg oom can be different: memcg of
invoker, killed task, memcg of going over limit.Suppose a process in
B triggers oom and a victim in root A is selected to be killed, we
may as well want to know memcg stats just local in A cgroup(excludes
BCD). So besides hierarchy info, does it acceptable to also print
the local root node stats which as I did in the V1
version(https://lkml.org/lkml/2012/7/30/179).
Ohh, I probably wasn't clear enough. I didn't suggest cumulative
numbers. Only per group. So it would be something like:

	for_each_mem_cgroup_tree(mi, memcg) {
		printk("Memory cgroup stats for %s", memcg_name);
		for (i = 0; i<  MEM_CGROUP_STAT_NSTATS; i++) {
			if (i == MEM_CGROUP_STAT_SWAP&&  !do_swap_account)
				continue;
			printk(KERN_CONT "%s:%lldKB ", mem_cgroup_stat_names[i],
				K(mem_cgroup_read_stat(mi, i)));
		}
		for (i = 0; i<  NR_LRU_LISTS; i++)
			printk(KERN_CONT "%s:%lluKB ", mem_cgroup_lru_names[i],
				K(mem_cgroup_nr_lru_pages(mi, BIT(i))));

		printk(KERN_CONT"\n");
	}
Another one I'm hesitating is numa stats, it seems the output is
beginning to get more and more....
NUMA stats are basically per node - per zone LRU data and that the
for(NR_LRU_LISTS) can be easily extended to cover that.
-- 
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help