Thread (67 messages) 67 messages, 6 authors, 2016-06-24

Re: [PATCH 10/10] mm: balance LRU lists based on relative thrashing

From: Johannes Weiner <hannes@cmpxchg.org>
Date: 2016-06-22 21:59:29
Also in: lkml

On Mon, Jun 20, 2016 at 04:42:08PM +0900, Minchan Kim wrote:
On Fri, Jun 17, 2016 at 01:01:29PM -0400, Johannes Weiner wrote:
quoted
On Fri, Jun 17, 2016 at 04:49:45PM +0900, Minchan Kim wrote:
quoted
On Thu, Jun 16, 2016 at 11:12:07AM -0400, Johannes Weiner wrote:
quoted
On Wed, Jun 15, 2016 at 11:23:41AM +0900, Minchan Kim wrote:
quoted
Do we want to retain [1]?

This patch motivates from swap IO could be much faster than file IO
so that it would be natural if we rely on refaulting feedback rather
than forcing evicting file cache?

[1] e9868505987a, mm,vmscan: only evict file pages when we have plenty?
Yes! We don't want to go after the workingset, whether it be cache or
anonymous, while there is single-use page cache lying around that we
can reclaim for free, with no IO and little risk of future IO. Anon
memory doesn't have this equivalent. Only cache is lazy-reclaimed.

Once the cache refaults, we activate it to reflect the fact that it's
workingset. Only when we run out of single-use cache do we want to
reclaim multi-use pages, and *then* we balance workingsets based on
cost of refetching each side from secondary storage.
If pages in inactive file LRU are really single-use page cache, I agree.

However, how does the logic can work like that?
If reclaimed file pages were part of workingset(i.e., refault happens),
we give the pressure to anonymous LRU but get_scan_count still force to
reclaim file lru until inactive file LRU list size is enough low.

With that, too many file workingset could be evicted although anon swap
is cheaper on fast swap storage?

IOW, refault mechanisme works once inactive file LRU list size is enough
small but small inactive file LRU doesn't guarantee it has only multiple
-use pages. Hm, Isn't it a problem?
It's a trade-off between the cost of detecting a new workingset from a
stream of use-once pages, and the cost of use-once pages impose on the
established workingset.

That's a pretty easy choice, if you ask me. I'd rather ask cache pages
to prove they are multi-use than have use-once pages put pressure on
the workingset.
Make sense.
quoted
Sure, a spike like you describe is certainly possible, where a good
portion of the inactive file pages will be re-used in the near future,
yet we evict all of them in a burst of memory pressure when we should
have swapped. That's a worst case scenario for the use-once policy in
a workingset transition.
So, the point is how such case it happens frequently. A scenario I can
think of is that if we use one-cgroup-per-app, many file pages would be
inactive LRU while active LRU is almost empty until reclaim kicks in.
Because normally, parallel reclaim work during launching new app makes
app's startup time really slow. That's why mobile platform uses notifiers
to get free memory in advance via kiling/reclaiming. Anyway, once we get
amount of free memory and lauching new app in a new cgroup, pages would
live his born LRU list(ie, anon: active file: inactive) without aging.

Then, activity manager can set memory.high of less important app-cgroup
to reclaim it with high value swappiness because swap device is much
faster on that system and much bigger anonymous pages compared to file-
backed pages. Surely, activity manager will expect lots of anonymous
pages be able to swap out but unlike expectation, he will see such spike
easily with reclaiming file-backed pages a lot and refault until inactive
file LRU is enough small.

I think it's enough possible scenario in small system one-cgroup-per-
app.
That's the workingset transition I was talking about. The algorithm is
designed to settle towards stable memory patterns. We can't possibly
remove one of the key components of this - the use-once policy - to
speed up a few seconds of workingset transition when it comes at the
risk of potentially thrashing the workingset for *hours*.

The fact that swap IO can be faster than filesystem IO doesn't change
this at all. The point is that the reclaim and refetch IO cost of
use-once cache is ZERO. Causing swap IO to make room for more and more
unused cache pages doesn't make any sense, no matter the swap speed.

I really don't see the relevance of this discussion to this patch set.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help