Re: [patch 2/8] mm: memcg-aware global reclaim
From: Ying Han <hidden>
Date: 2011-08-29 20:36:58
On Mon, Aug 29, 2011 at 12:04 PM, Johannes Weiner [off-list ref]wrote:
On Mon, Aug 29, 2011 at 12:22:02AM -0700, Ying Han wrote:quoted
On Mon, Aug 29, 2011 at 12:15 AM, Ying Han [off-list ref] wrote:quoted
On Thu, Aug 11, 2011 at 2:09 PM, Johannes Weiner [off-list ref]wrote:quoted
quoted
quoted
On Thu, Aug 11, 2011 at 01:39:45PM -0700, Ying Han wrote:quoted
Please consider including the following patch for the next post. Itcausesquoted
quoted
quoted
quoted
crash on some of the tests where sc->mem_cgroup is NULL (globalkswapd).quoted
quoted
quoted
quoted
diff --git a/mm/vmscan.c b/mm/vmscan.c index b72a844..12ab25d 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c@@ -2768,7 +2768,8 @@ loop_again: * Do some background aging of the anonlist, toquoted
quoted
quoted
quoted
give * pages a chance to be referenced before reclaiming. */ - if (inactive_anon_is_low(zone, &sc)) + if (scanning_global_lru(&sc) && + inactive_anon_is_low(zone,&sc))quoted
quoted
quoted
quoted
shrink_active_list(SWAP_CLUSTER_MAX,zone,quoted
quoted
quoted
quoted
&sc,priority, 0);quoted
quoted
quoted
Thanks! I completely overlooked this one and only noticed it after changing the arguments to shrink_active_list(). On memcg configurations, scanning_global_lru() will essentially never be true again, so I moved the anon pre-aging to a separate function that also does a hierarchy loop to preage the per-memcg anon lists. I hope to send out the next revision soon.Also, please consider to fold in the following patch as well. It fixes the root cgroup lru accounting and we could easily trigger OOM while doing some swapoff test w/o it. mm:fix the lru accounting for root cgroup. This patch is applied on top of: " mm: memcg-aware global reclaim Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> " This patch fixes the lru accounting for root cgroup. After the "memcg-aware global reclaim" patch, one of the changes is tohavequoted
quoted
lru pages linked back to root. Under the global memory pressure, westart fromquoted
quoted
the root cgroup lru and walk through the memcg hierarchy of the system.Forquoted
quoted
each memcg, we reclaim pages based on the its lru size. However for root cgroup, we used not having a seperate lru and onlycountingquoted
quoted
the pages charged to root as part of root lru size. Without this patch,allquoted
quoted
the pages which are linked to root lru but not charged to root likeswapcachequoted
quoted
readahead are not visible to page reclaim code and we are easily to getOOM.quoted
quoted
After this patch, all the pages linked under root lru are counted inthe lruquoted
quoted
size, including Used and !Used. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Ying Han <redacted>diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 5518f54..f6c5f29 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c@@ -888,19 +888,21 @@ void mem_cgroup_del_lru_list(struct page *page,enum lru_list lru) { >------struct page_cgroup *pc; >------struct mem_cgroup_per_zone *mz; +>------struct mem_cgroup *mem; · >------if (mem_cgroup_disabled()) >------>-------return; >------pc = lookup_page_cgroup(page); ->------/* can happen while we handle swapcache. */ ->------if (!TestClearPageCgroupAcctLRU(pc)) ->------>-------return; ->------VM_BUG_ON(!pc->mem_cgroup); ->------/* ->------ * We don't check PCG_USED bit. It's cleared when the "page" isfinallyquoted
quoted
->------ * removed from global LRU. ->------ */ ->------mz = page_cgroup_zoneinfo(pc->mem_cgroup, page); + +>------if (TestClearPageCgroupAcctLRU(pc) || PageCgroupUsed(pc)) {This PageCgroupUsed part confuses me. A page that is being isolated shortly after being charged while on the LRU may reach here, and then it is unaccounted from pc->mem_cgroup, which it never was accounted to. Could you explain why you added it?
To be honest, i don't have very good reason for that. The PageCgroupUsed check is put there after running some tests and some fixes seems help the test, including this one. The one case I can think of for page !AcctLRU | Used is in the pagevec. However, we shouldn't get to the mem_cgroup_del_lru_list() for a page in pagevec at the first place. I now made it so that PageCgroupAcctLRU on the LRU means accounted to
pc->mem_cgroup,
this is the same logic currently.
and !PageCgroupAcctLRU on the LRU means accounted to and babysitted by root_mem_cgroup.
this seems to be different from what it is now, especially for swapcache page. So, the page here is linked to root cgroup LRU or not? Anyway, the AcctLRU flags still seems confusing to me: what this flag tells me is that whether or not the page is on a PRIVATE lru and being accounted, i used private here to differentiate from the per zone lru, where it also has PageLRU flag. The two flags are separate since pages could be on one lru not the other ( I guess ) , but this is changed after having the root cgroup lru back. For example, AcctLRU is used to keep track of the accounted lru pages, especially for root ( we didn't account the !Used pages to root like readahead swapcache). Now we account the full size of lru list of root including Used and !Used, but only mark the Used pages w/ AcctLRU flag. So in general, i am wondering we should be able to replace that eventually with existing Used and LRU bit. Sorry this seems to be something we like to consider later, not necessarily now :) Always. Which also means that
before_commit now ensures an LRU page is moved to root_mem_cgroup for babysitting during the charge, so that concurrent isolations/putbacks are always accounted correctly. Is this what you had in mind? Did I miss something?
In my tree, the before->commit->after protocol is folded into one function. I didn't post it since I know you also have patch doing that. So guess I don't understand why we need to move the page to root while it is gonna be charged to a memcg by commit_charge shortly after. My understanding is that in before_commit, we uncharge the page from previous memcg lru if AcctLRU was set, then in the commit_charge we update the new owner of it. And in after_commit we update the memcg lru for the new owner after linking the page in the lru. --Ying