Thread (46 messages) 46 messages, 8 authors, 2007-08-26

Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks

From: Christoph Lameter <hidden>
Date: 2007-08-22 19:19:25
Also in: lkml

On Wed, 22 Aug 2007, Ingo Molnar wrote:
Could you outline the "big picture" as you see it? To me your argument 
that reclaim can always be done instantly and that the cases where it 
cannot be done are pathological and need to be avoided is fundamentally 
dangerous and quite a bit short-sighted at first glance.
That is a bit overdrawing my argument. The issues that Peter saw can be 
fixed by allowing recursive reclaim (see the earlier patchset). The rest 
is so far sugar on top or building extreme cases where we already have 
trouble.
The big picture way to think about this is the following: the free page 
pool is the "cache" of the MM. It's what "greases" the mechanism and 
bridges the inevitable reclaim latency and makes "atomic memory" 
available to the reclaim mechanism itself. We _cannot_ remove that cache 
without a conceptual replacement (or a _very_ robust argument and proof 
that the free pages pool is not needed at all - this would be a major 
design change (and a stupid mistake IMO)). Your patchset, in essence, 
tries to claim that we dont really need this cache and that all that 
matters is to keep enough clean pagecache pages around. That approach 
misses the full picture and i dont think we can progress without 
agreeing on the fundamentals first.
The patchset attempts to deal with the reserves in a more intelligent 
way in order not to fail when this pool becomes exhausted because some
device needs a lot of memory in the writeout path.
That "cache" cannot be handled in your scheme: a fully or mostly 
anonymous workload (tons of apps are like that) instantly destroys the 
"there is always a minimal amount of atomically reclaimable pages 
around" property of freelists, and this cannot be talked or tweaked 
around by twiddling any existing property of anonymous reclaim. 
A extreme anonymous workload like discussed here can even cause the 
current VM to fail. Realistically at least portions of the executable and 
varios slab caches will remain in memory in addition to the reserves.
Anonymous memory is dirty and takes ages to reclaim. The fact that your 
patchset causes an easy anonymous OOM further underlines this flaw of 
your thinking. Not making anonymous workloads OOM is the _hardest_ part 
of the MM, by far. Pagecache reclaim is a breeze in comparison :-)
The central flaw in my thinking was the switching of of PF_MEMALLOC on the 
writeout path instead of allowing recursive PF_MEMALLOC reclaim as in the 
first patch. But the first patchset did not have that flaw.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help