Thread (15 messages), 5 authors, 2016-06-10

[BUG] Page allocation failures with newest kernels

From: Yehuda Yitschak <hidden>
Date: 2016-05-31 13:12:24
Also in: linux-mm, lkml

Hi Robin,

During some of the stress tests we also came across a different warning from the arm64 page management code.
It looks like a race was detected between HW and SW marking a bit in the PTE.

I'm not sure it's really related, but I thought it might give a clue about the issue:
http://pastebin.com/ASv19vZP

Thanks

Yehuda 

-----Original Message-----
From: Marcin Wojtas [mailto:mw at semihalf.com]
Sent: Tuesday, May 31, 2016 13:30
To: Robin Murphy
Cc: linux-mm at kvack.org; linux-kernel at vger.kernel.org;
linux-arm-kernel at lists.infradead.org; Lior Amsalem; Thomas Petazzoni; Yehuda
Yitschak; Catalin Marinas; Arnd Bergmann; Grzegorz Jaszczyk; Will Deacon;
Nadav Haklai; Tomasz Nowicki; Gregory Clément
Subject: Re: [BUG] Page allocation failures with newest kernels

Hi Robin,
> I remember there were some issues around 4.2 with the revision of the
> arm64 atomic implementations affecting the cmpxchg_double() in SLUB,
> but those should all be fixed (and the symptoms tended to be
> considerably more fatal).
>
> A stronger candidate would be 97303480753e (which landed in 4.4),
> which has various knock-on effects on the layout of SLUB internals -
> does fiddling with L1_CACHE_SHIFT make any difference?
I'll check the commits, thanks. I forgot to add that L1_CACHE_SHIFT was my
first suspect - I had spent a long time debugging a network controller which
stopped working because of this change - L1_CACHE_BYTES (and hence
NET_SKB_PAD) no longer fit the HW constraints. Anyway, reverting it didn't
help at all with the page alloc issue.

Best regards,
Marcin
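
For readers following the L1_CACHE_SHIFT remark above, here is a minimal sketch of how the three values are related. It assumes the mainline definitions L1_CACHE_BYTES = (1 << L1_CACHE_SHIFT) and NET_SKB_PAD = max(32, L1_CACHE_BYTES), and it assumes the change under discussion raised the arm64 shift from 6 to 7; neither is stated in the thread. The helper names l1_cache_bytes() and net_skb_pad() are hypothetical stand-ins for the kernel macros, written as functions so different shift values can be compared.

```c
/* Hypothetical helper mirroring L1_CACHE_BYTES = (1 << L1_CACHE_SHIFT). */
static unsigned int l1_cache_bytes(unsigned int shift)
{
    return 1u << shift;
}

/*
 * Hypothetical helper mirroring NET_SKB_PAD = max(32, L1_CACHE_BYTES):
 * the skb headroom padding never drops below 32 bytes and otherwise
 * follows the cache line size.
 */
static unsigned int net_skb_pad(unsigned int shift)
{
    unsigned int bytes = l1_cache_bytes(shift);

    return bytes > 32u ? bytes : 32u;
}
```

Under these assumptions, a shift of 6 gives a 64-byte pad and a shift of 7 gives a 128-byte pad, so a controller whose hardware constrains the receive buffer offset could be affected by the larger value, which would match the symptom Marcin describes.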