Thread: 27 messages, 3 authors, 2011-09-01

Re: [patch] Revert "memcg: add memory.vmscan_stat"

From: Ying Han <hidden>
Date: 2011-09-01 07:04:30
Also in: lkml

On Wed, Aug 31, 2011 at 11:40 PM, Johannes Weiner [off-list ref] wrote:
On Wed, Aug 31, 2011 at 11:05:51PM -0700, Ying Han wrote:
On Tue, Aug 30, 2011 at 1:42 AM, Johannes Weiner [off-list ref] wrote:
You want to look at A and see whether its limit was responsible for
reclaim scans in any children.  IMO, that is asking the question
backwards.  Instead, there is a cgroup under reclaim and one wants to
find out the cause for that.  Not the other way round.

In my original proposal I suggested differentiating reclaim caused by
internal pressure (due to own limit) and reclaim caused by
external/hierarchical pressure (due to limits from parents).

If you want to find out why C is under reclaim, look at its reclaim
statistics.  If the _limit numbers are high, C's limit is the problem.
If the _hierarchical numbers are high, the problem is B, A, or
physical memory, so you check B for _limit and _hierarchical as well,
then move on to A.

Implementing this would be as easy as passing not only the memcg to
scan (victim) to the reclaim code, but also the memcg /causing/ the
reclaim (root_mem):

       root_mem == victim -> account to victim as _limit
       root_mem != victim -> account to victim as _hierarchical

This would make things much simpler and more natural, both the code
and the way of tracking down a problem, IMO.
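
A minimal sketch of the accounting rule described above, not the actual kernel code. The struct memcg type, its counter fields, and account_scan() are hypothetical names used only for illustration; a NULL root_mem stands in for reclaim driven by physical memory pressure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; only what the sketch needs. */
struct memcg {
	struct memcg *parent;               /* NULL at the top of the hierarchy */
	unsigned long scanned_limit;        /* scans caused by this group's own limit */
	unsigned long scanned_hierarchical; /* scans caused by a parent or global pressure */
};

/*
 * Account nr_scanned pages to the victim.
 * root_mem is the group whose limit caused the reclaim; NULL is assumed
 * here to mean reclaim caused by physical memory pressure.
 */
static void account_scan(struct memcg *root_mem, struct memcg *victim,
			 unsigned long nr_scanned)
{
	assert(victim != NULL);

	if (root_mem == victim)
		victim->scanned_limit += nr_scanned;        /* internal pressure */
	else
		victim->scanned_hierarchical += nr_scanned; /* external pressure */
}
```

With a hierarchy A -> B -> C, a scan of C caused by C's own limit lands in C's _limit counter, while a scan of C caused by A's limit or by global reclaim lands in C's _hierarchical counter.
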
This is pretty much the stats I am currently using for debugging the
reclaim patches. For example:

scanned_pages_by_system 0
scanned_pages_by_system_under_hierarchy 50989

scanned_pages_by_limit 0
scanned_pages_by_limit_under_hierarchy 0

"_system" is the count under global reclaim, and "_limit" is the count
under per-memcg reclaim.
"_under_hierarchy" is set if the memcg is not the one triggering pressure.
I don't get this distinction between _system and _limit.  How is it
orthogonal to _limit vs. _hierarchy, i.e. internal vs. external?
Something like :

+enum mem_cgroup_scan_context {
+       SCAN_BY_SYSTEM,
+       SCAN_BY_SYSTEM_UNDER_HIERARCHY,
+       SCAN_BY_LIMIT,
+       SCAN_BY_LIMIT_UNDER_HIERARCHY,
+       NR_SCAN_CONTEXT,
+};

if (global_reclaim(sc))
   context = SCAN_BY_SYSTEM;
else
   context = SCAN_BY_LIMIT;

if (target != mem)
   context++;
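
For reference, the selection logic above can be written as a self-contained sketch. This is only an illustration under assumptions: scan_context() is a hypothetical helper, and its two boolean parameters stand in for global_reclaim(sc) and for the target == mem comparison, neither of which is reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

enum mem_cgroup_scan_context {
	SCAN_BY_SYSTEM,
	SCAN_BY_SYSTEM_UNDER_HIERARCHY,
	SCAN_BY_LIMIT,
	SCAN_BY_LIMIT_UNDER_HIERARCHY,
	NR_SCAN_CONTEXT,
};

/*
 * Hypothetical helper: pick the counter for one scan.
 * global_reclaim: true for global reclaim, false for per-memcg reclaim.
 * is_target:      true if the scanned memcg is the one triggering pressure.
 */
static enum mem_cgroup_scan_context
scan_context(bool global_reclaim, bool is_target)
{
	int context = global_reclaim ? SCAN_BY_SYSTEM : SCAN_BY_LIMIT;

	/* Relies on each _UNDER_HIERARCHY value directly following its base. */
	if (!is_target)
		context++;

	assert(context < NR_SCAN_CONTEXT);
	return (enum mem_cgroup_scan_context)context;
}
```

The increment only works because of the enum ordering, so reordering the enumerators would silently break the accounting.
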
If the system scans memcgs then no limit is at fault.  It's just
external pressure.

For example, what is the distinction between scanned_pages_by_system
and scanned_pages_by_system_under_hierarchy?
You are right about this; there is not much difference between these,
since this counts global reclaim and everyone is under_hierarchy except
the root cgroup. For the root cgroup, it is counted in "_system"
(internal).

The reason for scanned_pages_by_system would be, per your definition,
neither due to
the limit (_by_system -> global reclaim) nor not due to the limit
(!_under_hierarchy -> memcg is the one triggering pressure)
This value "scanned_pages_by_system" only makes sense for the root
cgroup, where it could now be counted as "# of pages scanned in root lru
under global reclaim".

--Ying
