Re: [PATCH] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions
From: Michal Hocko <mhocko@suse.com>
Date: 2021-01-06 11:39:08
Also in:
lkml
On Wed 06-01-21 12:22:57, David Hildenbrand wrote:
On 06.01.21 11:42, Michal Hocko wrote:quoted
On Wed 06-01-21 10:56:19, David Hildenbrand wrote: [...]quoted
Note that this is not sufficient in the general case. I already mentioned that we effectively override an already initialized memmap. --- [ SECTION ] Before: [ ZONE_NORMAL ][ Hole ] The hole has some node/zone (currently 0/0, discussions ongoing on how to optimize that to e.g., ZONE_NORMAL in this example) and is PG_reserved - looks like an ordinary memory hole. After memremap: [ ZONE_NORMAL ][ ZONE_DEVICE ] The already initialized memmap was converted to ZONE_DEVICE. Your slowpath will work. After memunmap (no poisioning): [ ZONE_NORMAL ][ ZONE_DEVICE ] The slow path is no longer working. pfn_to_online_page() might return something that is ZONE_DEVICE. After memunmap (poisioning): [ ZONE_NORMAL ][ POISONED ] The slow path is no longer working. pfn_to_online_page() might return something that will BUG_ON via page_to_nid() etc. --- Reason is that pfn_to_online_page() does no care about sub-sections. And for now, it didn't had to. If there was an online section, it either was a) Completely present. The whole memmap is initialized to sane values. b) Partially present. The whole memmap is initialized to sane values. memremap/memunmap messes with case b)I do not see we ever clear the newly added flag and my understanding is that the subsection removed would lead to get_dev_pagemap returning a NULL. Which would obviously need to be checked for pfn_to_online_page. Or do I miss anything and the above is not the case and we could still get false positives?See my example above ("After memunmap"). We're still in the slow pathg. pfn_to_online_page() will return a struct page as get_dev_pagemap() is now NULL.
Yeah, my bad. I have clearly misread the patch. We would need som other means than relying on get_dev_pagemap if it doesn't survive the memunmap. :/ -- Michal Hocko SUSE Labs