Thread (15 messages), 5 authors, 2016-06-10

[BUG] Page allocation failures with newest kernels

From: Marcin Wojtas <hidden>
Date: 2016-06-10 16:08:09
Also in: linux-mm, lkml

Hi Mel,

Thanks for posting the patch. I tested it on LKv4.4.8. Although
"mode:0x2284020" shows that __GFP_ATOMIC is no longer stripped, the
issue remains:
http://pastebin.com/DmezUJSc

Best regards,
Marcin

2016-06-09 20:13 GMT+02:00 Marcin Wojtas [off-list ref]:
Hi Mel,

My last email got cut in half.

2016-06-08 12:09 GMT+02:00 Mel Gorman [off-list ref]:
On Tue, Jun 07, 2016 at 07:36:57PM +0200, Marcin Wojtas wrote:
Hi Mel,



2016-06-03 14:36 GMT+02:00 Mel Gorman [off-list ref]:
On Fri, Jun 03, 2016 at 01:57:06PM +0200, Marcin Wojtas wrote:
For the record: the newest kernel on which I was able to reproduce the
dumps was v4.6: http://pastebin.com/ekDdACn5. I've just checked v4.7-rc1,
which includes a lot of changes in mm (mainly yours), and I'm wondering
whether there may be a spot fix or rather a series of improvements. I'm
looking forward to your opinion and would be grateful for any advice.
I don't believe we want to reintroduce the reserve to cope with CMA. One
option would be to widen the gap between low and min watermark by the
size of the CMA region. The effect would be to wake kswapd earlier which
matters considering the context of the failing allocation was
GFP_ATOMIC.
Of course my intention is not to reintroduce anything that's gone
for good, but just to find a way to overcome the current issues. Do you
mean increasing the CMA size?
No. There is a gap between the low and min watermarks. At the low point,
kswapd is woken up, and at the min point allocation requests either
enter direct reclaim or fail if they are atomic. What I'm suggesting
is that you adjust the low watermark and add the size of the CMA area
to it so that kswapd is woken earlier. The watermarks are calculated in
__setup_per_zone_wmarks
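
The suggestion above can be sketched as a small userspace model. This is not
the kernel's __setup_per_zone_wmarks(); the struct, the function names and the
"low = min + min/4" gap are assumptions made for illustration, and the
below-min case follows the simplified description given above.

```c
#include <assert.h>

/* Simplified userspace model of two zone watermarks, in pages. */
struct zone_wmarks {
	unsigned long min;
	unsigned long low;
};

enum alloc_action {
	ALLOC_OK,              /* above low: nothing special happens     */
	ALLOC_WAKE_KSWAPD,     /* between min and low: kswapd is woken   */
	ALLOC_RECLAIM_OR_FAIL  /* at or below min: reclaim, or fail if atomic */
};

/*
 * zone_min_pages: the zone's share of min_free_kbytes, in pages.
 * cma_pages:      size of the CMA area in pages, added to 'low' only.
 * The (min >> 2) gap is an assumed default, not taken from the thread.
 */
struct zone_wmarks model_wmarks(unsigned long zone_min_pages,
				unsigned long cma_pages)
{
	struct zone_wmarks w;

	w.min = zone_min_pages;
	w.low = w.min + (zone_min_pages >> 2) + cma_pages;
	return w;
}

/* Classify an allocation attempt by the number of free pages in the zone. */
enum alloc_action classify(const struct zone_wmarks *w,
			   unsigned long free_pages)
{
	assert(w->min <= w->low);

	if (free_pages <= w->min)
		return ALLOC_RECLAIM_OR_FAIL;
	if (free_pages <= w->low)
		return ALLOC_WAKE_KSWAPD;
	return ALLOC_OK;
}
```

With zone_min_pages = 400 and cma_pages = 0 the model gives low = 500; with
cma_pages = 1000 it gives low = 1500, so a zone with 1200 free pages would
already wake kswapd instead of waiting until it drops to 500.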
I printed the settings of all zones whose watermarks are configured within
__setup_per_zone_wmarks(). There are three: DMA, Normal and Movable.
Only the first one's watermarks have non-zero values. Increasing the DMA min
watermark didn't help. I also played with increasing
Patch?
I played with increasing min_free_kbytes from ~2600 to 16000. It
shifted the watermark levels in __setup_per_zone_wmarks(), but only
for zone DMA. Normal and Movable remained at 0. No progress with
avoiding page alloc failures: the gap between 'free' and 'free_cma'
was huge, so I don't think that CMA itself is the root cause.
Did you establish why GFP_ATOMIC (assuming that's the failing site) had
not specified __GFP_ATOMIC at the time of the allocation failure?
Yes. It happens in new_slab(), in the following line:
return allocate_slab(s, flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
I added "| GFP_ATOMIC" and in that case I got the same dumps, but with one
more bit set in gfp_mask, so I don't think it's the issue.
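
To illustrate how masking can drop a flag bit, here is a minimal userspace
sketch. The X_* bit values are hypothetical and are not the kernel's;
X_ALLOWED_MASK only stands in for (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK),
and widening the mask is just one possible reading of the "| GFP_ATOMIC"
change described above.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values, for illustration only (NOT the kernel's). */
#define X_GFP_HIGH            0x01u
#define X_GFP_ATOMIC          0x02u
#define X_GFP_KSWAPD_RECLAIM  0x04u
#define X_GFP_DIRECT_RECLAIM  0x08u
#define X_GFP_THISNODE        0x10u

/* Stand-in for the mask used in new_slab(); it omits X_GFP_ATOMIC. */
#define X_ALLOWED_MASK \
	(X_GFP_HIGH | X_GFP_KSWAPD_RECLAIM | X_GFP_DIRECT_RECLAIM | X_GFP_THISNODE)

/* Stand-in for a GFP_ATOMIC-style request built from several bits. */
#define X_GFP_ATOMIC_ALLOC \
	(X_GFP_HIGH | X_GFP_ATOMIC | X_GFP_KSWAPD_RECLAIM)

/* Masking as in the quoted line: bits outside the mask are dropped. */
gfp_t strip_flags(gfp_t flags)
{
	/* The model only makes sense if the mask really omits the bit. */
	assert((X_ALLOWED_MASK & X_GFP_ATOMIC) == 0);
	return flags & X_ALLOWED_MASK;
}

/* Same masking with the atomic bit added to the mask, so it survives. */
gfp_t strip_flags_keep_atomic(gfp_t flags)
{
	return flags & (X_ALLOWED_MASK | X_GFP_ATOMIC);
}
```

In this model strip_flags(X_GFP_ATOMIC_ALLOC) loses the atomic bit, while
strip_flags_keep_atomic() returns the request unchanged, which corresponds to
seeing one more bit set in the reported mode.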

The latest patches in v4.7-rc1 seem to boost page alloc performance enough
to avoid the problems observed between v4.2 and v4.6. Hence, before
rebasing from v4.4 to another LTS >v4.7 in the future, we decided as a
workaround to return to using MIGRATE_RESERVE plus a fix for
early_page_nid_uninitialised(). Operation now seems stable on all our
SoCs during the tests.

Best regards,
Marcin