Thread (110 messages) 110 messages, 8 authors, 2011-09-21

Re: [patch 2/8] mm: memcg-aware global reclaim

From: Ying Han <hidden>
Date: 2011-08-29 20:36:58

On Mon, Aug 29, 2011 at 12:04 PM, Johannes Weiner [off-list ref]wrote:
On Mon, Aug 29, 2011 at 12:22:02AM -0700, Ying Han wrote:
quoted
On Mon, Aug 29, 2011 at 12:15 AM, Ying Han [off-list ref] wrote:
quoted
On Thu, Aug 11, 2011 at 2:09 PM, Johannes Weiner [off-list ref]
wrote:
quoted
quoted
quoted
On Thu, Aug 11, 2011 at 01:39:45PM -0700, Ying Han wrote:
quoted
Please consider including the following patch for the next post. It
causes
quoted
quoted
quoted
quoted
crash on some of the tests where sc->mem_cgroup is NULL (global
kswapd).
quoted
quoted
quoted
quoted
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b72a844..12ab25d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2768,7 +2768,8 @@ loop_again:
                         * Do some background aging of the anon
list, to
quoted
quoted
quoted
quoted
give
                         * pages a chance to be referenced before
reclaiming.
                         */
-                       if (inactive_anon_is_low(zone, &sc))
+                       if (scanning_global_lru(&sc) &&
+                                       inactive_anon_is_low(zone,
&sc))
quoted
quoted
quoted
quoted
                                shrink_active_list(SWAP_CLUSTER_MAX,
zone,
quoted
quoted
quoted
quoted
                                                        &sc,
priority, 0);
quoted
quoted
quoted
Thanks!  I completely overlooked this one and only noticed it after
changing the arguments to shrink_active_list().

On memcg configurations, scanning_global_lru() will essentially never
be true again, so I moved the anon pre-aging to a separate function
that also does a hierarchy loop to preage the per-memcg anon lists.

I hope to send out the next revision soon.
Also, please consider to fold in the following patch as well. It fixes
the root cgroup lru accounting and we could easily trigger OOM while
doing some swapoff test w/o it.

mm:fix the lru accounting for root cgroup.

This patch is applied on top of:
"
mm: memcg-aware global reclaim
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
"

This patch fixes the lru accounting for root cgroup.

After the "memcg-aware global reclaim" patch, one of the changes is to
have
quoted
quoted
lru pages linked back to root. Under the global memory pressure, we
start from
quoted
quoted
the root cgroup lru and walk through the memcg hierarchy of the system.
For
quoted
quoted
each memcg, we reclaim pages based on the its lru size.

However for root cgroup, we used not having a seperate lru and only
counting
quoted
quoted
the pages charged to root as part of root lru size. Without this patch,
all
quoted
quoted
the pages which are linked to root lru but not charged to root like
swapcache
quoted
quoted
readahead are not visible to page reclaim code and we are easily to get
OOM.
quoted
quoted
After this patch, all the pages linked under root lru are counted in
the lru
quoted
quoted
size, including Used and !Used.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Ying Han <redacted>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5518f54..f6c5f29 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -888,19 +888,21 @@ void mem_cgroup_del_lru_list(struct page *page,
enum lru_list lru)
 {
 >------struct page_cgroup *pc;
 >------struct mem_cgroup_per_zone *mz;
+>------struct mem_cgroup *mem;
·
 >------if (mem_cgroup_disabled())
 >------>-------return;
 >------pc = lookup_page_cgroup(page);
->------/* can happen while we handle swapcache. */
->------if (!TestClearPageCgroupAcctLRU(pc))
->------>-------return;
->------VM_BUG_ON(!pc->mem_cgroup);
->------/*
->------ * We don't check PCG_USED bit. It's cleared when the "page" is
finally
quoted
quoted
->------ * removed from global LRU.
->------ */
->------mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
+
+>------if (TestClearPageCgroupAcctLRU(pc) || PageCgroupUsed(pc)) {
This PageCgroupUsed part confuses me.  A page that is being isolated
shortly after being charged while on the LRU may reach here, and then
it is unaccounted from pc->mem_cgroup, which it never was accounted
to.

Could you explain why you added it?
To be honest, i don't have very good reason for that. The PageCgroupUsed
check is put there after running some tests and some fixes seems help the
test, including this one.

The one case I can think of for page !AcctLRU | Used is in the pagevec.
However, we shouldn't get to the mem_cgroup_del_lru_list() for a page in
pagevec at the first place.

I now made it so that PageCgroupAcctLRU on the LRU means accounted to
pc->mem_cgroup,

this is the same logic currently.

and !PageCgroupAcctLRU on the LRU means accounted to
and babysitted by root_mem_cgroup.

this seems to be different from what it is now, especially for swapcache
page. So, the page here is linked to root cgroup LRU or not?

Anyway, the AcctLRU flags still seems confusing to me:

what this flag tells me is that whether or not the page is on a PRIVATE lru
and being accounted, i used private here to differentiate from the per zone
lru, where it also has PageLRU flag.  The two flags are separate since pages
could be on one lru not the other ( I guess ) , but this is changed after
having the root cgroup lru back. For example, AcctLRU is used to keep track
of the accounted lru pages, especially for root ( we didn't account the
!Used pages to root like readahead swapcache). Now we account the full size
of lru list of root including Used and !Used, but only mark the Used pages
w/ AcctLRU flag.

So in general, i am wondering we should be able to replace that eventually
with existing Used and LRU bit.  Sorry this seems to be something we like to
consider later, not necessarily now :)

Always.  Which also means that
before_commit now ensures an LRU page is moved to root_mem_cgroup for
babysitting during the charge, so that concurrent isolations/putbacks
are always accounted correctly.  Is this what you had in mind?  Did I
miss something?
In my tree, the before->commit->after protocol is folded into one function.
I didn't post it since I know you also have patch doing that.  So guess I
don't understand why we need to move the page to root while it is gonna be
charged to a memcg by commit_charge shortly after.

My understanding is that in before_commit, we uncharge the page from
previous memcg lru if AcctLRU was set, then in the commit_charge we update
the new owner of it. And in after_commit we update the memcg lru for the new
owner after linking the page in the lru.

--Ying
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help