Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)
From: Andrew Lutomirski <hidden>
Date: 2011-05-22 12:22:49
Also in:
lkml
On Sat, May 21, 2011 at 10:44 AM, Minchan Kim [off-list ref] wrote:
> Hi Andrew.
>
> On Sat, May 21, 2011 at 10:34 PM, Andrew Lutomirski [off-list ref] wrote:
>> On Sat, May 21, 2011 at 8:04 AM, KOSAKI Motohiro [off-list ref] wrote:
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index 3f44b81..d1dabc9 100644
>>>> @@ -1426,8 +1437,13 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
>>>>  	/* Check if we should syncronously wait for writeback */
>>>>  	if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
>>>> +		unsigned long nr_active, old_nr_scanned;
>>>>  		set_reclaim_mode(priority, sc, true);
>>>> +		nr_active = clear_active_flags(&page_list, NULL);
>>>> +		count_vm_events(PGDEACTIVATE, nr_active);
>>>> +		old_nr_scanned = sc->nr_scanned;
>>>>  		nr_reclaimed += shrink_page_list(&page_list, zone, sc);
>>>> +		sc->nr_scanned = old_nr_scanned;
>>>>  	}
>>>>  	local_irq_disable();
>>>>
>>>> I just tested 2.6.38.6 with the attached patch. It survived
>>>> dirty_ram and test_mempressure without any problems other than
>>>> slowness, but when I hit ctrl-c to stop test_mempressure, I got the
>>>> attached oom.
>>>
>>> Minchan,
>>>
>>> I'm confused now.
>>> If pages got SetPageActive(), should_reclaim_stall() should never
>>> return true. Can you please explain which bad scenario was happen?
>>>
>>> ----------------------------------------------------------------------
>>> static void reset_reclaim_mode(struct scan_control *sc)
>>> {
>>>         sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
>>> }
>>>
>>> shrink_page_list()
>>> {
>>>  (snip)
>>>  activate_locked:
>>>                 SetPageActive(page);
>>>                 pgactivate++;
>>>                 unlock_page(page);
>>>                 reset_reclaim_mode(sc);                /// here
>>>                 list_add(&page->lru, &ret_pages);
>>>         }
>>> ----------------------------------------------------------------------
>>>
>>> ----------------------------------------------------------------------
>>> bool should_reclaim_stall()
>>> {
>>>  (snip)
>>>
>>>         /* Only stall on lumpy reclaim */
>>>         if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)    /// and here
>>>                 return false;
>>> ----------------------------------------------------------------------
>>
>> I did some tracing and the oops happens from the second call to
>> shrink_page_list after should_reclaim_stall returns true and it hits
>> the same pages in the same order that the earlier call just finished
>> calling SetPageActive on. I have *not* confirmed that the two calls
>> happened from the same call to shrink_inactive_list, but something's
>> certainly wrong in there.
>>
>> This is very easy to reproduce on my laptop.
>
> I would like to confirm this problem.
> Could you show the diff of 2.6.38.6 with current your 2.6.38.6 + alpha?
> (ie, I would like to know that what patches you add up on vanilla
> 2.6.38.6 to reproduce this problem)
> I believe you added my crap below patch. Right?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 292582c..69d317e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -311,7 +311,8 @@ static void set_reclaim_mode(int priority, struct scan_control *sc,
>  	 */
>  	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
>  		sc->reclaim_mode |= syncmode;
> -	else if (sc->order && priority < DEF_PRIORITY - 2)
> +	else if ((sc->order && priority < DEF_PRIORITY - 2) ||
> +				prioiry <= DEF_PRIORITY / 3)
>  		sc->reclaim_mode |= syncmode;
>  	else
>  		sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
> @@ -1349,10 +1350,6 @@ static inline bool should_reclaim_stall(unsigned long nr_taken,
>  	if (current_is_kswapd())
>  		return false;
>
> -	/* Only stall on lumpy reclaim */
> -	if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
> -		return false;
> -

Bah. It's this last hunk. Without it I can't reproduce the oops.
With it, should_reclaim_stall() no longer checks RECLAIM_MODE_SINGLE,
so the reset_reclaim_mode() call has no effect and shrink_page_list()
is incorrectly called twice.

So we're back to the original problem...

--Andy
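
For readers following the thread, the interaction between the two functions can be sketched in user space. This is a toy model under stated assumptions, not kernel code: the flag values, `MODEL_MODE_LUMPY`, `should_reclaim_stall_model()`, and `count_shrink_calls()` are hypothetical names made up for illustration, and every stall condition other than the mode check is assumed to ask for a stall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define RECLAIM_MODE_SINGLE 0x1u
#define RECLAIM_MODE_ASYNC  0x2u
#define MODEL_MODE_LUMPY    0x8u   /* hypothetical stand-in for a lumpy mode */

struct scan_control {
    unsigned int reclaim_mode;
};

/* Same body as the reset_reclaim_mode() quoted in the thread. */
void reset_reclaim_mode(struct scan_control *sc)
{
    assert(sc != NULL);
    sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
}

/*
 * Toy version of should_reclaim_stall(). Only the mode check is modelled;
 * check_single == false corresponds to the hunk that removes the check.
 */
bool should_reclaim_stall_model(const struct scan_control *sc, bool check_single)
{
    assert(sc != NULL);
    if (check_single && (sc->reclaim_mode & RECLAIM_MODE_SINGLE))
        return false;
    return true;   /* assumption: all other conditions request a stall */
}

/* Count how often shrink_page_list() would run in one reclaim pass. */
int count_shrink_calls(bool check_single)
{
    struct scan_control sc = { .reclaim_mode = MODEL_MODE_LUMPY | RECLAIM_MODE_ASYNC };
    int calls = 0;

    calls++;                   /* first shrink_page_list() call */
    reset_reclaim_mode(&sc);   /* a page went through activate_locked */

    if (should_reclaim_stall_model(&sc, check_single))
        calls++;               /* second, synchronous shrink_page_list() call */

    return calls;
}
```

In this model, `count_shrink_calls(true)` returns 1 and `count_shrink_calls(false)` returns 2: once the RECLAIM_MODE_SINGLE check is gone, the reset performed at activate_locked no longer suppresses the second pass over the same page list.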