Thread: 54 messages, 11 authors, 2011-05-20

Re: [PATCH 0/4] Reduce impact to overall system of SLUB using high-order allocations V2

From: Mel Gorman <mgorman@suse.de>
Date: 2011-05-13 15:43:22
Also in: linux-fsdevel, linux-mm, lkml

On Fri, May 13, 2011 at 10:21:46AM -0500, Christoph Lameter wrote:
> On Fri, 13 May 2011, Mel Gorman wrote:
>
> > SLUB using high orders is the trigger but not the root cause as SLUB
> > has been using high orders for a while. The following four patches
> > aim to fix the problems in reclaim while reducing the cost for SLUB
> > using those high orders.
> >
> > Patch 1 corrects logic introduced by commit [1741c877: mm:
> > 	kswapd: keep kswapd awake for high-order allocations until
> > 	a percentage of the node is balanced] to allow kswapd to
> > 	go to sleep when balanced for high orders.
>
> The above looks good.

Ok.

> > Patch 2 prevents kswapd waking up in response to SLUB's speculative
> > 	use of high orders.
>
> Not sure if that is necessary since it seems that we triggered kswapd
> before? Why not continue to do it? Once kswapd has enough higher order
> pages kswapd should no longer be triggered right?

Because kswapd waking up isn't cheap and we are reclaiming pages
just so SLUB may get high-order pages in the future. As it's for
PAGE_ALLOC_COSTLY_ORDER, we are not entering lumpy reclaim and just
selecting a few random order-0 pages which may or may not help. There
is very little control of how many pages are getting freed if kswapd
is being woken frequently.

> > Patch 3 further reduces the cost by preventing SLUB entering direct
> > 	compaction or reclaim paths on the grounds that falling
> > 	back to order-0 should be cheaper.
>
> It's cheaper for the reclaim path, true, but more expensive in terms of
> SLUB's management costs of the data and it also increases the memory wasted.

Surely the reclaim cost exceeds SLUB management cost?

> A higher order means denser packing of objects and less page management
> overhead. Fallback is not for free.

Neither is reclaiming a large bunch of pages. Worse, reclaiming
pages so SLUB gets a high-order means it's likely to be stealing
MIGRATE_MOVABLE blocks which eventually gives diminishing returns but
may not be noticeable for weeks. From a fragmentation perspective,
it's better if SLUB uses order-0 allocations when memory is low so
that SLUB pages continue to get packed into as few MIGRATE_UNMOVABLE
and MIGRATE_RECLAIMABLE blocks as possible.

> Reasonable effort should be made to
> allocate the page order requested.
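
To put rough numbers on the packing argument in this exchange, here is a
minimal userspace sketch (not kernel code). It assumes a 4k base page and
ignores per-slab metadata and alignment padding. The function names and
the 704-byte object size used in the note below are hypothetical.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4k base page */

/* Bytes spanned by a slab made of 2^order contiguous pages. */
static unsigned long slab_bytes(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/* Objects of obj_size bytes that fit in one slab of the given order. */
static unsigned long objs_per_slab(unsigned int order, unsigned long obj_size)
{
	assert(obj_size > 0);
	return slab_bytes(order) / obj_size;
}

/* Bytes left at the end of the slab that cannot hold another object. */
static unsigned long slab_waste(unsigned int order, unsigned long obj_size)
{
	assert(obj_size > 0);
	return slab_bytes(order) % obj_size;
}
```

With a hypothetical 704-byte object, an order-0 slab holds 5 objects and
leaves 576 bytes unused (about 14%), while an order-3 slab holds 46 and
leaves 384 bytes (about 1%). That is the density gain being argued for;
the counterpoint is that obtaining the order-3 page may require reclaim,
which this arithmetic does not price in.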

> > Patch 4 notes that even when kswapd is failing to keep up with
> > 	allocation requests, it should still go to sleep when its
> > 	quota has expired to prevent it spinning.
>
> Looks good too.
>
> Overall, it looks like the compaction logic and the modifications to
> reclaim introduced recently with the intent to increase the amount of
> physically contiguous memory are not working as expected.

The reclaim and kswapd damage was unintended and this is my fault
but reclaim/compaction still makes a lot more sense than lumpy
reclaim. Testing showed it disrupted the system a lot less and
allocated high-order pages faster with fewer pages reclaimed.

> SLUB's chance of getting higher order pages should be *increasing* as a
> result of these changes. The above looks like the chances are decreasing
> now.

Patches 2 and 3 may mean that SLUB gets fewer high order pages when
memory is low and it's depending on high-order pages to be naturally
freed by SLUB as it recycles slabs of old objects. On the flip-side,
fewer pages will be reclaimed. I'd expect the latter option is
cheaper overall.
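
The trade-off can be sketched as a toy userspace model. This is not the
code from the patches: struct toy_mem, toy_alloc() and alloc_slab_pages()
are hypothetical names, and the model assumes that a reclaim pass only
ever yields order-0 pages. It shows the shape of the idea: the
high-order attempt is speculative and never reclaims, and only the
minimum-order fallback is allowed to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy memory state; every field here is hypothetical. */
struct toy_mem {
	int max_free_order;		/* largest free order, -1 if none */
	unsigned int reclaim_passes;	/* how often reclaim had to run */
};

/*
 * Toy page allocator. Succeeds at once if a block of the requested
 * order is free. Otherwise, if the caller allows it, runs one reclaim
 * pass, which in this model only ever frees order-0 pages.
 */
static bool toy_alloc(struct toy_mem *m, unsigned int order, bool may_reclaim)
{
	assert(m != NULL);
	if ((int)order <= m->max_free_order)
		return true;
	if (!may_reclaim)
		return false;
	m->reclaim_passes++;
	return order == 0;
}

/* Returns the order actually allocated, or -1 on failure. */
static int alloc_slab_pages(struct toy_mem *m, unsigned int preferred,
			    unsigned int min_order)
{
	/* Speculative attempt: take the high order only if it is free. */
	if (preferred > min_order && toy_alloc(m, preferred, false))
		return (int)preferred;

	/* Fallback: the minimum order may reclaim if it has to. */
	if (toy_alloc(m, min_order, true))
		return (int)min_order;

	return -1;
}
```

In this model a fragmented system with only order-0 pages free gets an
order-0 slab without a single reclaim pass, which is the cost saving
argued for above. The price is that the caller gets the preferred order
only when such a block already happens to be free.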

> This is a matter of future concern. The metadata management overhead
> in the kernel is continually increasing since memory sizes keep growing
> and we typically manage memory in 4k chunks. Through large allocation
> sizes we can reduce that management overhead but we can only do this if we
> have an effective way of defragmenting memory to get longer contiguous
> chunks that can be managed through a single page struct.
>
> Please make sure that compaction and related measures really work properly.

Local testing still shows them to be behaving as expected but then
again, I haven't reproduced the simple problem reported by Chris
and James despite using a few different laptops and two different
low-end servers.

> The patches suggest that the recent modifications are not improving the
> situation.

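
On the metadata overhead point quoted above, a back-of-envelope sketch
of how the number of separately managed chunks shrinks as the management
unit grows. This is plain userspace arithmetic assuming a 4k base page;
managed_chunks() is a hypothetical helper and the sketch says nothing
about how the kernel actually lays out its page structs.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed 4k base page */

/*
 * Number of chunks to track when mem_bytes of memory is managed in
 * units of 2^order contiguous pages.
 */
static unsigned long long managed_chunks(unsigned long long mem_bytes,
					 unsigned int order)
{
	unsigned long long chunk = SKETCH_PAGE_SIZE << order;

	assert(chunk > 0);
	return mem_bytes / chunk;
}
```

For 1 GiB of memory this gives 262144 chunks at order 0 and 32768 at
order 3, an eightfold reduction, provided contiguous order-3 blocks can
actually be found.
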
-- 
Mel Gorman
SUSE Labs