Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Pekka Enberg <hidden>
Date: 2009-10-22 14:47:11
Also in:
linux-mm, lkml
On Thu, Oct 22, 2009 at 5:22 PM, Mel Gorman [off-list ref] wrote:
Test 1: Verify your problem occurs on 2.6.32-rc5 if you can
Test 2: Apply the following two patches and test again
  1/5 page allocator: Always wake kswapd when restarting an allocation attempt after direct reclaim failed
  2/5 page allocator: Do not allow interrupts to use ALLOC_HARDER
These are pretty obvious bug fixes and should go to linux-next ASAP IMHO.
Test 5: If things are still screwed, apply the following
  5/5 Revert 373c0a7e, 8aa7e847: Fix congestion_wait() sync/async vs read/write confusion
  Frans Pop reports that the bulk of his problems go away when this patch is reverted on 2.6.31. There has been some confusion on why exactly this patch was wrong but apparently the conversion was not complete and further work was required. It's unknown if all the necessary work exists in 2.6.31-rc5 or not. If there are still allocation failures and applying this patch fixes the problem, there are still snags that need to be ironed out.
As explained by Jens Axboe, this changes timing but is not the source of the OOMs, so the revert is bogus even if it "helps" on some workloads. IIRC the person who reported that the revert helps also reported that the OOMs did not go away; they were simply harder to trigger with the revert.