Re: [PATCH 4/4] mm: vmscan: If kswapd has been running too long, allow it to sleep
From: Minchan Kim <hidden>
Date: 2011-05-16 05:04:00
Also in: linux-fsdevel, linux-mm, lkml
On Mon, May 16, 2011 at 1:21 PM, James Bottomley [off-list ref] wrote:
> On Sun, 2011-05-15 at 19:27 +0900, KOSAKI Motohiro wrote:
> > (2011/05/13 23:03), Mel Gorman wrote:
> > > Under constant allocation pressure, kswapd can be in the situation where
> > > sleeping_prematurely() will always return true even if kswapd has been
> > > running a long time. Check if kswapd needs to be scheduled.
> > >
> > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > ---
> > >  mm/vmscan.c |    4 ++++
> > >  1 files changed, 4 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index af24d1e..4d24828 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2251,6 +2251,10 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
> > >  	unsigned long balanced = 0;
> > >  	bool all_zones_ok = true;
> > >
> > > +	/* If kswapd has been running too long, just sleep */
> > > +	if (need_resched())
> > > +		return false;
> > > +
> >
> > Hmm... I don't like this patch so much. because this code does
> >
> > - don't sleep if kswapd got context switch at shrink_inactive_list
>
> This isn't entirely true: need_resched() will be false, so we'll follow
> the normal path for determining whether to sleep or not, in effect
> leaving the current behaviour unchanged.
>
> > - sleep if kswapd didn't
>
> This also isn't entirely true: whether need_resched() is true at this
> point depends on a whole lot more that whether we did a context switch
> in shrink_inactive. It mostly depends on how long we've been running
> without giving up the CPU. Generally that will mean we've been round
> the shrinker loop hundreds to thousands of times without sleeping.
>
> > It seems to be semi random behavior.
>
> Well, we have to do something. Chris Mason first suspected the hang was
> a kswapd rescheduling problem a while ago. We tried putting
> cond_rescheds() in several places in the vmscan code, but to no avail.
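
To make the two cases discussed above concrete, here is a minimal userspace sketch of the decision the patched check makes. It is only a model under stated assumptions: model_sleeping_prematurely(), resched_pending and node_balanced are hypothetical names, not kernel symbols. resched_pending stands in for need_resched(), and node_balanced stands in for the outcome of the existing balance checks.

```c
#include <stdbool.h>

/*
 * Userspace model only. These names are hypothetical stand-ins and are
 * not kernel APIs:
 *   resched_pending - models need_resched() returning true
 *   node_balanced   - models the result of the existing balance checks
 *
 * Returns true when going to sleep would be premature.
 */
static bool model_sleeping_prematurely(bool resched_pending, bool node_balanced)
{
	/* The added check: the scheduler wants the CPU back, so allow sleep. */
	if (resched_pending)
		return false;

	/* Otherwise the pre-patch decision applies unchanged. */
	return !node_balanced;
}
```

With resched_pending false, the result depends only on node_balanced, which matches the point that the current behaviour is left unchanged in that case. With resched_pending true, the model always permits sleeping, regardless of balance.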

Is that a result from testing with Hannes' patch (i.e., !pgdat_balanced)?
If it isn't, putting cond_resched() anywhere in vmscan.c would be a no-op.
Even though zone balancing has completed, kswapd doesn't sleep because
pgdat_balanced() returns the wrong result, so in the end the VM calls
balance_pgdat() again. In this case, balance_pgdat() returns without doing
any work, because kswapd can't find any zone that is short of free pages
and does goto out. kswapd could repeat this infinitely, so you never get a
chance to call cond_resched().

But if your test was with Hannes' patch, I am very curious why kswapd
consumes so much CPU.
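
The loop I am describing can be sketched in userspace like this. It is a simplified model, not the vmscan code: struct loop_stats, model_balance_pgdat() and model_kswapd_loop() are hypothetical names, and the single counter stands in for any cond_resched() that sits on the reclaim path. The assumption is that the sleep check keeps reporting "premature" while no zone actually needs reclaim.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for the model; not a kernel structure. */
struct loop_stats {
	unsigned long iterations;     /* passes through the kswapd-like loop */
	unsigned long resched_points; /* times the modelled cond_resched() ran */
};

/*
 * Models balance_pgdat(): the reclaim work, and the reschedule point inside
 * it, are only reached when some zone is short of free pages.
 */
static void model_balance_pgdat(bool zones_need_reclaim, struct loop_stats *st)
{
	if (!zones_need_reclaim)
		return; /* the "goto out" path: nothing to do */

	st->resched_points++; /* stands in for cond_resched() during reclaim */
}

/*
 * Models the outer loop: keep calling balance_pgdat() for as long as the
 * sleep check claims that sleeping would be premature. max_iterations only
 * bounds the simulation; the loop described above has no such bound.
 */
static struct loop_stats model_kswapd_loop(bool sleep_check_says_premature,
					   bool zones_need_reclaim,
					   unsigned long max_iterations)
{
	struct loop_stats st = { 0, 0 };

	while (st.iterations < max_iterations) {
		st.iterations++;
		model_balance_pgdat(zones_need_reclaim, &st);
		if (!sleep_check_says_premature)
			break; /* kswapd is allowed to sleep */
	}
	return st;
}
```

Calling model_kswapd_loop(true, false, 1000) runs all 1000 iterations and never reaches the reschedule point, which is why extra cond_resched() calls on the reclaim path would not help in that state.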

> The need_resched() in sleeping_prematurely() seems to be about the best
> option. The other option might be just to put a cond_resched() in
> kswapd_try_to_sleep(), but that will really have about the same effect.

I don't oppose it, but before that I think we need to know why kswapd
consumes so much CPU even with Hannes' patch applied.

> James
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
Kind regards,
Minchan Kim