[BUG] Page allocation failures with newest kernels
From: Yehuda Yitschak <hidden>
Date: 2016-05-31 13:12:24
Also in: linux-mm, lkml
Hi Robin,

During some of the stress tests we also came across a different warning from the arm64 page management code. It looks like a race is detected between HW and SW marking a bit in the PTE. Not sure it's really related, but I thought it might give a clue on the issue:

http://pastebin.com/ASv19vZP

Thanks,
Yehuda
-----Original Message-----
From: Marcin Wojtas [mailto:mw at semihalf.com]
Sent: Tuesday, May 31, 2016 13:30
To: Robin Murphy
Cc: linux-mm at kvack.org; linux-kernel at vger.kernel.org; linux-arm-kernel at lists.infradead.org; Lior Amsalem; Thomas Petazzoni; Yehuda Yitschak; Catalin Marinas; Arnd Bergmann; Grzegorz Jaszczyk; Will Deacon; Nadav Haklai; Tomasz Nowicki; Gregory Clément
Subject: Re: [BUG] Page allocation failures with newest kernels

Hi Robin,
> I remember there were some issues around 4.2 with the revision of the arm64 atomic implementations affecting the cmpxchg_double() in SLUB, but those should all be fixed (and the symptoms tended to be considerably more fatal).
> A stronger candidate would be 97303480753e (which landed in 4.4), which has various knock-on effects on the layout of SLUB internals - does fiddling with L1_CACHE_SHIFT make any difference?

I'll check the commits, thanks. I forgot to add that L1_CACHE_SHIFT was my first suspect - I had spent a long time debugging a network controller, which stopped working because of this change: L1_CACHE_BYTES (and hence NET_SKB_PAD) no longer fit the HW constraints. Anyway, reverting it didn't help at all with the page alloc issue.

Best regards,
Marcin