Thread (110 messages) 110 messages, 8 authors, 2011-09-21

Re: [patch 2/8] mm: memcg-aware global reclaim

From: Ying Han <hidden>
Date: 2011-08-31 22:58:21
Also in: lkml

On Tue, Aug 30, 2011 at 8:14 AM, Johannes Weiner [off-list ref] wrote:
On Tue, Aug 30, 2011 at 12:07:07AM -0700, Ying Han wrote:
quoted
On Mon, Aug 29, 2011 at 2:05 PM, Johannes Weiner [off-list ref] wrote:
quoted
On Mon, Aug 29, 2011 at 01:36:48PM -0700, Ying Han wrote:
quoted
On Mon, Aug 29, 2011 at 12:04 PM, Johannes Weiner [off-list ref]wrote:
quoted
On Mon, Aug 29, 2011 at 12:22:02AM -0700, Ying Han wrote:
quoted
quoted
@@ -888,19 +888,21 @@ void mem_cgroup_del_lru_list(struct page *page,
enum lru_list lru)
 {
 >------struct page_cgroup *pc;
 >------struct mem_cgroup_per_zone *mz;
+>------struct mem_cgroup *mem;
·
 >------if (mem_cgroup_disabled())
 >------>-------return;
 >------pc = lookup_page_cgroup(page);
->------/* can happen while we handle swapcache. */
->------if (!TestClearPageCgroupAcctLRU(pc))
->------>-------return;
->------VM_BUG_ON(!pc->mem_cgroup);
->------/*
->------ * We don't check PCG_USED bit. It's cleared when the "page" is
finally
quoted
quoted
->------ * removed from global LRU.
->------ */
->------mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
+
+>------if (TestClearPageCgroupAcctLRU(pc) || PageCgroupUsed(pc)) {
This PageCgroupUsed part confuses me.  A page that is being isolated
shortly after being charged while on the LRU may reach here, and then
it is unaccounted from pc->mem_cgroup, which it never was accounted
to.

Could you explain why you added it?
To be honest, i don't have very good reason for that. The PageCgroupUsed
check is put there after running some tests and some fixes seems help the
test, including this one.

The one case I can think of for page !AcctLRU | Used is in the pagevec.
However, we shouldn't get to the mem_cgroup_del_lru_list() for a page in
pagevec at the first place.

I now made it so that PageCgroupAcctLRU on the LRU means accounted
to pc->mem_cgroup,

this is the same logic currently.
quoted
and !PageCgroupAcctLRU on the LRU means accounted to
and babysitted by root_mem_cgroup.
this seems to be different from what it is now, especially for swapcache
page. So, the page here is linked to root cgroup LRU or not?

Anyway, the AcctLRU flags still seems confusing to me:

what this flag tells me is that whether or not the page is on a PRIVATE lru
and being accounted, i used private here to differentiate from the per zone
lru, where it also has PageLRU flag.  The two flags are separate since pages
could be on one lru not the other ( I guess ) , but this is changed after
having the root cgroup lru back. For example, AcctLRU is used to keep track
of the accounted lru pages, especially for root ( we didn't account the
!Used pages to root like readahead swapcache). Now we account the full size
of lru list of root including Used and !Used, but only mark the Used pages
w/ AcctLRU flag.

So in general, i am wondering we should be able to replace that eventually
with existing Used and LRU bit.  Sorry this seems to be something we like to
consider later, not necessarily now :)
I have now the following comment in mem_cgroup_lru_del_list():

       /*
        * root_mem_cgroup babysits uncharged LRU pages, but
        * PageCgroupUsed is cleared when the page is about to get
        * freed.  PageCgroupAcctLRU remembers whether the
        * LRU-accounting happened against pc->mem_cgroup or
        * root_mem_cgroup.
        */

Does that answer your question?  If not, please tell me, so I can fix
the comment :-)
Sorry, not clear to me yet :(

Is this saying that we can not differentiate the page linked to root
but not charged vs
page linked to memcg which is about to be freed.

If that is the case, isn't the page being removed from lru first
before doing uncharge (ClearPageCgroupUsed) ?
Sorry for getting back to this late.
It depends.  From the reclaim path, yes.  But it may be freed through
__page_cache_release() for example, which unlinks after uncharge.
That is true. And the comment start making senses to me. Thanks.

The problem here is the inconsistency of the pc->mem_cgroup and
page->lru for uncharged pages ( !Used). And even further, that is
caused by (only?) pages silently floating from memcg lru
to root lru after they are uncharged (before they are freed). And I
wonder those pages will be short lived.

Guess my question is why those pages have to travel to root and then
freed quickly, and we just leave them in the memcg lru?

--Ying
So when we reach mem_cgroup_lru_del(), PageCgroupUsed could be cleared
with the page being lru-accounted to root_mem_cgroup (swap readahead,
swapoff) or
cleared with the page being lru-accounted to a different
memcg (truncate/invalidate, unmap)
quoted
quoted
quoted
quoted
Always.  Which also means that before_commit now ensures an LRU
page is moved to root_mem_cgroup for babysitting during the
charge, so that concurrent isolations/putbacks are always
accounted correctly.  Is this what you had in mind?  Did I miss
something?
In my tree, the before->commit->after protocol is folded into one function.
I didn't post it since I know you also have patch doing that.  So guess I
don't understand why we need to move the page to root while it is gonna be
charged to a memcg by commit_charge shortly after.
It is a consequence of your fix that LRU-accounts unused pages to
root_mem_cgroup upon lru-add, and thus deaccounts !PageCgroupAcctLRU
from root_mem_cgroup unconditionally upon lru-del.

Consider the following scenario:

       1. page with multiple mappings swapped out.

       2. one memcg faults the page, then unmaps it.  The page is
       uncharged, but swap-freeing fails due to the other ptes, and
       the page stays lru-accounted on the memcg it's no longer
       charged to.
I agree that a page could be ending up on a memcg-lru (AcctLRU) but
not charged (!Used). But not sure
if the case above is true or not, since we don't uncharge a page which
marked as SwapCache until the
page is removed from the swapcache.
Blergh, you are right.  I missed the PageSwapCache() check in
__mem_cgroup_uncharge_common().  That looks pretty misplaced up there,
btw, I see whether it can be moved.
quoted
One case which we might change the owner of a page while it is linked
on lru is calling reuse_swap_page() under write fault, so the page is
uncharged after removing from
swapcache while linked in the old memcg lru. It will be adjust by
commit_charge_swapin() later.
Yes, this scenario has this window where PageCgroupAcctLRU is cleared
in before_commit and reclaim could race and isolate the page,
unaccounting it from root_mem_cgroup which it was never charged to.
quoted
quoted
       3. another memcg faults the page.  before_commit must
       lru-unaccount from pc->mem_cgroup before pc->mem_cgroup is
       overwritten.

       4. the page is charged.  after_commit does the fixup.

Between 3. and 4., a reclaimer can isolate the page.  The old
lru-accounting is undone and mem_cgroup_lru_del() does this:

       if (TestClearPageCgroupAcctLRU(pc)) {
               VM_BUG_ON(!pc->mem_cgroup);
               mem = pc->mem_cgroup;
       } else
               mem = root_mem_cgroup;
      mz = page_cgroup_zoneinfo(mem, page);
       /* huge page split is done under lru_lock. so, we have no races. */
       MEM_CGROUP_ZSTAT(mz, lru) -= 1 << compound_order(page);

The rule is that !PageCgroupAcctLRU means that the page is
lru-accounted to root_mem_cgroup.  So when charging, the page has to
be moved to root_mem_cgroup until a new memcg is responsible for it.
So here we are saying that isolating a page which has be
mem_cgroup_lru_del().  Isn't the later one does lru-unaccount and also
list_del(), so is that possible to isolate a page not on lru. Or is
this caused by not clearing the LRU bit in before_commit?
mem_cgroup_lru_del() does not do list_del() anymore.  It's just about
accounting and, in the add case, returning the proper lruvec.

Calling it on a page not on the LRU is a bug.
quoted
quoted
quoted
My understanding is that in before_commit, we uncharge the page from
previous memcg lru if AcctLRU was set, then in the commit_charge we update
the new owner of it. And in after_commit we update the memcg lru for the new
owner after linking the page in the lru.
Exactly, just that between unaccounting from the old and accounting to
the new, someone else may look at the page and has to find it in a
sensible state.
Wonder if clearing the PageLRU after before_commit is helpful here.
How would after_commit detect whether the page needs relinking or not?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help