Re: [PATCH 4/4] mm: vmscan: If kswapd has been running too long, allow it to sleep

From: Minchan Kim <hidden>
Date: 2011-05-16 05:04:00
Also in: linux-fsdevel, linux-mm, lkml

On Mon, May 16, 2011 at 1:21 PM, James Bottomley
[off-list ref] wrote:

On Sun, 2011-05-15 at 19:27 +0900, KOSAKI Motohiro wrote:

(2011/05/13 23:03), Mel Gorman wrote:

Under constant allocation pressure, kswapd can be in the situation where
sleeping_prematurely() will always return true even if kswapd has been
running a long time. Check if kswapd needs to be scheduled.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
  mm/vmscan.c |    4 ++++
  1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index af24d1e..4d24828 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2251,6 +2251,10 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
    unsigned long balanced = 0;
    bool all_zones_ok = true;

+   /* If kswapd has been running too long, just sleep */
+   if (need_resched())
+           return false;
+

Hmm... I don't like this patch so much, because this code does the following:

- don't sleep if kswapd got a context switch at shrink_inactive_list

This isn't entirely true:  need_resched() will be false, so we'll follow
the normal path for determining whether to sleep or not, in effect
leaving the current behaviour unchanged.

- sleep if kswapd didn't

This also isn't entirely true: whether need_resched() is true at this
point depends on a whole lot more than whether we did a context switch
in shrink_inactive. It mostly depends on how long we've been running
without giving up the CPU.  Generally that will mean we've been round
the shrinker loop hundreds to thousands of times without sleeping.
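
The point about need_resched() can be illustrated with a small userspace
sketch in C. It is only a model under stated assumptions: the real
need_resched() tests a flag that the scheduler sets on the running task,
whereas here that is approximated by a hypothetical iteration counter (ran)
and budget (timeslice). The names task_model, model_need_resched,
model_sleeping_prematurely and spin_until_sleep are invented for
illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scheduler state of kswapd. */
struct task_model {
	unsigned long ran;       /* loop iterations since the task last gave up the CPU */
	unsigned long timeslice; /* assumed budget after which a reschedule is wanted */
};

/* Approximation only: true once the task has used up its budget. */
static bool model_need_resched(const struct task_model *t)
{
	return t->ran >= t->timeslice;
}

/*
 * Shape of the proposed check: if a reschedule is pending, report
 * "not premature" so the caller sleeps; otherwise fall through to
 * whatever the normal checks decided.
 */
static bool model_sleeping_prematurely(const struct task_model *t,
				       bool normal_checks_say_premature)
{
	if (model_need_resched(t))
		return false;
	return normal_checks_say_premature;
}

/*
 * Worst case from the thread: the normal checks always say "premature".
 * Returns the number of iterations run before the model allows sleep,
 * capped at max_iters.
 */
static unsigned long spin_until_sleep(struct task_model *t,
				      unsigned long max_iters)
{
	assert(t->timeslice > 0);
	while (t->ran < max_iters && model_sleeping_prematurely(t, true))
		t->ran++;
	return t->ran;
}
```

In this model, a task that has not yet used up its budget sees exactly the
old behaviour, which matches the first objection above; only a task that has
run for its whole budget is forced to sleep.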

It seems to be semi random behavior.

Well, we have to do something.  Chris Mason first suspected the hang was
a kswapd rescheduling problem a while ago.  We tried putting
cond_resched() calls in several places in the vmscan code, but to no avail.

Is that the result of a test with Hannes' patch applied (i.e., !pgdat_balanced)?

If it isn't, putting cond_resched() in vmscan.c would be a no-op.
Even after zone balancing completes, kswapd doesn't sleep, because
pgdat_balanced() returns the wrong result, so the VM ends up calling
balance_pgdat() again. In that case balance_pgdat() returns without doing
any work: kswapd can't find any zone that is short of free pages, so it
jumps to out. kswapd can repeat this indefinitely, so it never gets a
chance to call cond_resched().
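
The loop described above can be sketched as a small userspace model in C.
This is an illustration under assumptions, not the kernel code: struct model
and the model_* functions are hypothetical stand-ins, the flag
sleep_check_is_wrong stands for the sleep check giving the wrong answer
without Hannes' patch, and resched_points counts how often the model reaches
the reclaim work where a cond_resched() is assumed to sit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of this is real kernel code. */
struct model {
	bool zones_balanced;          /* true once reclaim has balanced every zone */
	bool sleep_check_is_wrong;    /* models the sleep check without the fix */
	unsigned long resched_points; /* times the reclaim path (and its assumed
					 cond_resched()) was actually reached */
};

/* Model of balance_pgdat(): reclaim only runs if some zone needs it. */
static void model_balance_pgdat(struct model *m)
{
	if (m->zones_balanced)
		return;              /* the "goto out" case: no work done */
	m->resched_points++;         /* reclaim work, where rescheduling could happen */
	m->zones_balanced = true;
}

/* Model of the sleep check: a broken check always says "premature". */
static bool model_sleeping_prematurely(const struct model *m)
{
	if (m->sleep_check_is_wrong)
		return true;
	return !m->zones_balanced;
}

/*
 * Model of the kswapd loop. Returns the number of iterations run before
 * kswapd was allowed to sleep, or max_iters if it never was.
 */
static unsigned long model_kswapd(struct model *m, unsigned long max_iters)
{
	unsigned long i;

	assert(max_iters > 0);
	for (i = 0; i < max_iters; i++) {
		if (!model_sleeping_prematurely(m))
			return i;    /* kswapd goes to sleep */
		model_balance_pgdat(m);
	}
	return max_iters;
}
```

With the broken check the model spins for every iteration it is given while
reaching the reclaim path only once, so adding more cond_resched() calls
inside reclaim would not help. With a correct check it sleeps after a single
pass.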

But if your test was with Hannes' patch, I am very curious why
kswapd consumes so much CPU.

The need_resched() in sleeping_prematurely() seems to be about the best
option.  The other option might be just to put a cond_resched() in
kswapd_try_to_sleep(), but that will really have about the same effect.

I don't oppose it, but before that I think we need to understand why kswapd
consumes so much CPU even with Hannes' patch applied.

James


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>



-- 
Kind regards,
Minchan Kim
