Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Mel Gorman <hidden>
Date: 2009-10-22 16:03:06
Also in: linux-mm, lkml
On Thu, Oct 22, 2009 at 05:47:10PM +0300, Pekka Enberg wrote:
> On Thu, Oct 22, 2009 at 5:22 PM, Mel Gorman [off-list ref] wrote:
> > Test 1: Verify your problem occurs on 2.6.32-rc5 if you can
> >
> > Test 2: Apply the following two patches and test again
> >
> >   1/5 page allocator: Always wake kswapd when restarting an allocation attempt after direct reclaim failed
> >   2/5 page allocator: Do not allow interrupts to use ALLOC_HARDER
>
> These are pretty obvious bug fixes and should go to linux-next ASAP IMHO.

Agreed, but I wanted to pin down where exactly we stand with this problem before sending patches any direction for merging.

> > Test 5: If things are still screwed, apply the following
> >
> >   5/5 Revert 373c0a7e, 8aa7e847: Fix congestion_wait() sync/async vs read/write confusion
> >
> > Frans Pop reports that the bulk of his problems go away when this patch is reverted on 2.6.31. There has been some confusion on why exactly this patch was wrong but apparently the conversion was not complete and further work was required. It's unknown if all the necessary work exists in 2.6.31-rc5 or not. If there are still allocation failures and applying this patch fixes the problem, there are still snags that need to be ironed out.
>
> As explained by Jens Axboe, this changes timing but is not the source of the OOMs so the revert is bogus even if it "helps" on some workloads. IIRC the person who reported the revert to help things did report that the OOMs did not go away, they were simply harder to trigger with the revert.

IIRC, there were mixed reports as to how much the revert helped. I'm hoping that patches 1+2 cover the bases hence why I asked them to be tested on their own. Patch 2 in particular might be responsible for watermarks being impacted enough to cause timing problems.

I left reverting with patch 5 as a standalone test to see how much of a factor the timing changes introduced are if there are still allocation problems.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab