Re: [PATCH V7 2/2] mm: memcg detect no memcgs above softlimit under zone reclaim
From: Rik van Riel <hidden>
Date: 2012-08-06 14:27:41
On 08/06/2012 10:03 AM, Michal Hocko wrote:
> On Wed 01-08-12 16:10:32, Rik van Riel wrote:
>> On 08/01/2012 03:04 PM, Ying Han wrote:
>>> That is true. Hmm, then two things I can do:
>>>
>>> 1. for kswapd case, make sure not to count the root cgroup
>>> 2. or check nr_scanned. I like the nr_scanned which is telling us
>>>    whether or not the reclaim ever made any attempt?
>>
>> I am looking at a more advanced case of (3) right now. Once I have
>> the basics working, I will send you a prototype (that applies on top
>> of your patches) to play with.
>>
>> Basically, for every LRU in the system, we can keep track of 4 things:
>> - reclaim_stat->recent_scanned
>> - reclaim_stat->recent_rotated
>> - reclaim_stat->recent_pressure
>> - LRU size
>>
>> The first two represent the fraction of pages on the list that are
>> actively used. The larger the fraction of recently used pages, the
>> more valuable the cache is. The inverse of that can be used to show
>> us how hard to reclaim this cache, compared to other caches
>> (everything else being equal).
>>
>> The recent pressure can be used to keep track of how many pages we
>> have scanned on each LRU list recently. Pressure is scaled with LRU
>> size.
>>
>> This would be the basic formula to decide which LRU to reclaim from:
>>
>>           recent_scanned       LRU size
>>   score = -------------- * ---------------
>>           recent_rotated   recent_pressure
>>
>> In other words, the less the objects on an LRU are used, the more we
>> should reclaim from that LRU. The larger an LRU is, the more we
>> should reclaim from that LRU.
>
> The formula makes sense but I am afraid that it will be hard to tune
> it into something that wouldn't regress. For example I have seen
> workloads which had many small groups which are used to wrap up
> backup jobs and those are scanned a lot, you would see also many
> rotations because of the writeback but those are definitely good to
> scan rather than a large group which needs to keep its data resident.
Writeback rotations are not counted in lruvec->reclaim_stat->recent_rotated - only the rotations that were done because we really want to keep the page are counted.
> Anyway, I am not saying the score approach is a bad idea but I am
> afraid it will be hard to validate and make it right.
One thing about the recent_scanned / recent_rotated metric is that we have been using it since 2.6.28, to balance between scanning the file and anonymous LRUs. I believe it would help us balance between multiple sets of LRUs, too.
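The score formula quoted above can be sketched in plain userspace C as
follows. This is only an illustration, not the prototype mentioned in
the thread: the struct lru_stats type, the lru_score() name, the use of
floating point, and the +1 guards against division by zero are all
assumptions made for the example. Only the four quantities themselves
come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-LRU counters. The field names follow the thread;
 * the struct itself is illustrative and is not a kernel type.
 */
struct lru_stats {
    uint64_t recent_scanned;   /* pages scanned recently on this LRU */
    uint64_t recent_rotated;   /* pages kept because they were in use */
    uint64_t recent_pressure;  /* recent scanning effort on this LRU */
    uint64_t lru_size;         /* number of pages on this LRU */
};

/*
 * score = (recent_scanned / recent_rotated) * (lru_size / recent_pressure)
 *
 * A higher score means "reclaim from this LRU first". The +1 in each
 * denominator is an assumption that keeps the sketch safe when a
 * counter is still zero.
 */
double lru_score(const struct lru_stats *s)
{
    assert(s != NULL);

    /* Large when few of the scanned pages were found to be in use. */
    double coldness = (double)s->recent_scanned /
                      (double)(s->recent_rotated + 1);

    /* Large for big LRUs that have seen little pressure so far. */
    double size_vs_pressure = (double)s->lru_size /
                              (double)(s->recent_pressure + 1);

    return coldness * size_vs_pressure;
}
```

With these definitions, an LRU with few rotations scores higher than an
otherwise identical LRU with many, a larger LRU scores higher than a
smaller one, and added pressure lowers the score.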
>> The more we have already scanned an LRU, the lower its score
>> becomes. At some point, another LRU will have the top score, and
>> that will be the target to scan.
>
> So you think we shouldn't do the full round over memcgs in
> shrink_zone and rather do it the oom way to pick up a victim and
> hammer it?
Not hammer it too far. Only until its score ends up well below (25% lower?) that of the second highest scoring list. That way all the lists get hammered a little bit, in turn.
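The "scan the top scorer until it falls well below the runner-up" idea
can be sketched as below. This is a userspace illustration under stated
assumptions: the function names, the array-of-scores representation,
and the SCORE_MARGIN constant are hypothetical, and the 0.75 factor
simply encodes the tentative "25% lower?" figure from the reply.

```c
#include <assert.h>
#include <stddef.h>

/* Tentative margin taken from the "25% lower?" remark; not a tuned value. */
#define SCORE_MARGIN 0.75

/*
 * Return the index of the highest score, ignoring index `skip`.
 * Pass skip == n to ignore nothing. Returns n if no candidate exists.
 */
size_t top_score_index(const double *scores, size_t n, size_t skip)
{
    size_t best = n; /* n means "none found yet" */

    assert(scores != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i == skip)
            continue;
        if (best == n || scores[i] > scores[best])
            best = i;
    }
    return best;
}

/*
 * Keep scanning the current target until its score has dropped below
 * SCORE_MARGIN times the runner-up's score. After that, another LRU
 * becomes the target, so every list gets scanned a little, in turn.
 */
int keep_scanning(double target_score, double runner_up_score)
{
    return target_score >= SCORE_MARGIN * runner_up_score;
}
```

A caller would pick the target with top_score_index(scores, n, n), pick
the runner-up by calling it again with the target as `skip`, and then
scan the target in batches, recomputing its score after each batch,
until keep_scanning() returns false.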
>> We can adjust the score for different LRUs in different ways, e.g.:
>> - swappiness adjustment for file vs anon LRUs, within an LRU set
>> - if an LRU set contains a file LRU with more inactive than active
>>   pages, reclaim from this LRU set first
>> - if an LRU set is over its soft limit, reclaim from this LRU set
>>   first
>
> maybe we could replace LRU size by (LRU size - soft_limit) in the
> above formula?
Good idea, that could work.

-- 
All rights reversed
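The suggestion of using (LRU size - soft_limit) in place of LRU size
can be sketched as below. The thread does not say what should happen
when an LRU set is at or under its soft limit; clamping the excess to
zero is an assumption made here, as are the function names, the use of
floating point, and the +1 guards against division by zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pages above the soft limit. Clamped to zero when the LRU is at or
 * below its limit (an assumption, the thread leaves this case open).
 */
uint64_t soft_limit_excess(uint64_t lru_size, uint64_t soft_limit)
{
    return lru_size > soft_limit ? lru_size - soft_limit : 0;
}

/*
 * Variant of the score formula with LRU size replaced by
 * (LRU size - soft_limit):
 *
 *   score = (recent_scanned / recent_rotated)
 *           * ((lru_size - soft_limit) / recent_pressure)
 */
double lru_score_soft(uint64_t recent_scanned, uint64_t recent_rotated,
                      uint64_t recent_pressure, uint64_t lru_size,
                      uint64_t soft_limit)
{
    double coldness = (double)recent_scanned /
                      (double)(recent_rotated + 1);
    double excess_vs_pressure =
        (double)soft_limit_excess(lru_size, soft_limit) /
        (double)(recent_pressure + 1);

    return coldness * excess_vs_pressure;
}
```

Under the clamping assumption, an LRU set at or below its soft limit
scores zero and is only chosen when nothing else scores higher, while
the score grows with the amount by which the set exceeds its limit.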