Re: [PATCH V7 2/2] mm: memcg detect no memcgs above softlimit under zone reclaim

From: Michal Hocko <hidden>
Date: 2012-08-06 14:03:56

On Wed 01-08-12 16:10:32, Rik van Riel wrote:

On 08/01/2012 03:04 PM, Ying Han wrote:

That is true. Hmm, then there are two things I can do:

1. for the kswapd case, make sure not to count the root cgroup
2. or check nr_scanned. I like nr_scanned, which tells us
whether or not reclaim ever made any attempt.

I am looking at a more advanced case of (3) right
now.  Once I have the basics working, I will send
you a prototype (that applies on top of your patches)
to play with.

Basically, for every LRU in the system, we can keep
track of 4 things:
- reclaim_stat->recent_scanned
- reclaim_stat->recent_rotated
- reclaim_stat->recent_pressure
- LRU size

The first two represent the fraction of pages on the
list that are actively used.  The larger the fraction
of recently used pages, the more valuable the cache
is. The inverse of that can be used to show us how
hard to reclaim from this cache, compared to other caches
(everything else being equal).

The recent pressure can be used to keep track of how
many pages we have scanned on each LRU list recently.
Pressure is scaled with LRU size.

This would be the basic formula to decide which LRU
to reclaim from:

          recent_scanned   LRU size
score =   -------------- * ----------------
          recent_rotated   recent_pressure


In other words, the less the objects on an LRU are
used, the more we should reclaim from that LRU. The
larger an LRU is, the more we should reclaim from
that LRU.
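
A minimal userspace sketch of the formula above, for illustration only. The
struct and function names are hypothetical, recent_pressure is a counter that
is only proposed in this thread, and the +1 in each denominator is an
assumption added here to avoid dividing by zero. Kernel code would use
fixed-point arithmetic rather than doubles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-LRU counters mirroring the four values listed above. */
struct lru_stats {
	uint64_t recent_scanned;
	uint64_t recent_rotated;
	uint64_t recent_pressure;	/* proposed in this thread, not an existing field */
	uint64_t lru_size;
};

/*
 * score = (recent_scanned / recent_rotated) * (lru_size / recent_pressure)
 *
 * The +1 in each denominator is an assumption of this sketch so that an
 * LRU with no rotations or no pressure yet does not divide by zero.
 */
double lru_score(const struct lru_stats *s)
{
	assert(s != NULL);

	double scanned_per_rotated =
		(double)s->recent_scanned / (double)(s->recent_rotated + 1);
	double size_per_pressure =
		(double)s->lru_size / (double)(s->recent_pressure + 1);

	return scanned_per_rotated * size_per_pressure;
}
```

With equal size and pressure, an LRU whose pages are rarely rotated scores
higher than one whose pages are rotated on nearly every scan.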

The formula makes sense, but I am afraid it will be hard to tune it
into something that doesn't regress. For example, I have seen workloads
with many small groups that are used to wrap up backup jobs. Those groups
are scanned a lot, and you would also see many rotations because of
writeback, yet they are definitely better to scan than a large
group which needs to keep its data resident.
Anyway, I am not saying the score approach is a bad idea, but I am afraid
it will be hard to validate and get right.

The more we have already scanned an LRU, the lower
its score becomes. At some point, another LRU will
have the top score, and that will be the target to
scan.
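
A rough sketch of that selection step, under the same caveats: userspace C
with hypothetical names, a +1 guard in each denominator, and pressure
accounted as the raw number of pages scanned. How pressure would really be
scaled or decayed is not specified in this thread, and the scanned/rotated
ratio is left unchanged by a scan here to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-LRU counters; recent_pressure is only proposed here. */
struct lru_stats {
	uint64_t recent_scanned;
	uint64_t recent_rotated;
	uint64_t recent_pressure;
	uint64_t lru_size;
};

/* Same formula as quoted above, with +1 guards against division by zero. */
double lru_score(const struct lru_stats *s)
{
	return ((double)s->recent_scanned / (double)(s->recent_rotated + 1)) *
	       ((double)s->lru_size / (double)(s->recent_pressure + 1));
}

/* Return the index of the LRU with the highest score (first one wins ties). */
size_t pick_lru(const struct lru_stats *lrus, size_t n)
{
	size_t best = 0;

	assert(lrus != NULL && n > 0);
	for (size_t i = 1; i < n; i++)
		if (lru_score(&lrus[i]) > lru_score(&lrus[best]))
			best = i;
	return best;
}

/* Charge nr scanned pages to an LRU, which lowers its score next time. */
void account_scan(struct lru_stats *s, uint64_t nr)
{
	s->recent_pressure += nr;
}
```

Calling pick_lru(), scanning the winner, and calling account_scan() in a loop
moves the target from one LRU to the next as pressure accumulates.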

So you think we shouldn't do the full round over memcgs in shrink_zone
and should rather do it the OOM way: pick a victim and hammer it?

We can adjust the score for different LRUs in different
ways, eg.:
- swappiness adjustment for file vs anon LRUs, within
  an LRU set
- if an LRU set contains a file LRU with more inactive
  than active pages, reclaim from this LRU set first
- if an LRU set is over its soft limit, reclaim from
  this LRU set first

Maybe we could replace LRU size with (LRU size - soft_limit) in the above
formula?
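
To make that concrete, here is a small userspace sketch with hypothetical
names. It assumes the excess is clamped to zero for a group below its soft
limit, so such a group scores 0, and it keeps a +1 guard in each denominator.
Both choices are assumptions of the sketch. The soft limit is a per-memcg
value, so comparing it against a single LRU size is a simplification too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-LRU counters; recent_pressure is only proposed here. */
struct lru_stats {
	uint64_t recent_scanned;
	uint64_t recent_rotated;
	uint64_t recent_pressure;
	uint64_t lru_size;
};

/* Pages above the soft limit, clamped at zero (the clamp is an assumption). */
uint64_t soft_limit_excess(uint64_t lru_size, uint64_t soft_limit)
{
	return lru_size > soft_limit ? lru_size - soft_limit : 0;
}

/*
 * Variant of the score with (LRU size - soft_limit) in place of LRU size:
 * score = (recent_scanned / recent_rotated) * (excess / recent_pressure)
 */
double lru_score_soft(const struct lru_stats *s, uint64_t soft_limit)
{
	assert(s != NULL);

	double excess = (double)soft_limit_excess(s->lru_size, soft_limit);

	return ((double)s->recent_scanned / (double)(s->recent_rotated + 1)) *
	       (excess / (double)(s->recent_pressure + 1));
}
```

One consequence of the clamp is that every group under its soft limit ties at
zero, so some fallback ordering would still be needed when nothing is over
the limit.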

This also gives us a nice way to balance memory pressure
between zones, etc...

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
