Thread (13 messages) 13 messages, 3 authors, 2021-01-12

Re: [PATCH] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions

From: David Hildenbrand <hidden>
Date: 2021-01-06 11:24:33
Also in: lkml

On 06.01.21 11:42, Michal Hocko wrote:
On Wed 06-01-21 10:56:19, David Hildenbrand wrote:
[...]
quoted
Note that this is not sufficient in the general case. I already
mentioned that we effectively override an already initialized memmap.

---

[        SECTION             ]
Before:
[ ZONE_NORMAL ][    Hole     ]

The hole has some node/zone (currently 0/0, discussions ongoing on how
to optimize that to e.g., ZONE_NORMAL in this example) and is
PG_reserved - looks like an ordinary memory hole.

After memremap:
[ ZONE_NORMAL ][ ZONE_DEVICE ]

The already initialized memmap was converted to ZONE_DEVICE. Your
slowpath will work.

After memunmap (no poisioning):
[ ZONE_NORMAL ][ ZONE_DEVICE ]

The slow path is no longer working. pfn_to_online_page() might return
something that is ZONE_DEVICE.

After memunmap (poisioning):
[ ZONE_NORMAL ][ POISONED    ]

The slow path is no longer working. pfn_to_online_page() might return
something that will BUG_ON via page_to_nid() etc.

---

Reason is that pfn_to_online_page() does no care about sub-sections. And
for now, it didn't had to. If there was an online section, it either was

a) Completely present. The whole memmap is initialized to sane values.
b) Partially present. The whole memmap is initialized to sane values.

memremap/memunmap messes with case b)
I do not see we ever clear the newly added flag and my understanding is
that the subsection removed would lead to get_dev_pagemap returning a
NULL. Which would obviously need to be checked for pfn_to_online_page.
Or do I miss anything and the above is not the case and we could still
get false positives?
See my example above ("After memunmap").

We're still in the slow pathg. pfn_to_online_page() will return a struct
page as get_dev_pagemap() is now  NULL.

Yet page_zone(page) will either
- BUG_ON (memmap was poisoned)
- return ZONE_DEVICE zone (memmap not poisoned when memunmapping)

As I said, can be tackled by checking for pfn_section_valid() at least
on the slow path. Ideally also on the fast path.
quoted
Well have to further tweak pfn_to_online_page(). You'll have to also
check pfn_section_valid() *at least* on the slow path. Less-hacky would
be checking it also in the "somehwat-faster" path - that would cover
silently overriding a memmap that's visible via pfn_to_online_page().
Might slow down things a bit.


Not completely opposed to this, but I would certainly still prefer just
avoiding this corner case completely instead of patching around it. Thanks!
Well, I would love to have no surprises either. So far there was not
actual argument why the pmem reserved space cannot be fully initialized.
Yes, I'm still hoping Dan can clarify that.
On the other hand making sure that pfn_to_online_page sounds like the
right thing to do. And having an explicit check for zone device there in
a slow path makes sense to me.
As I said, I'd favor to simplify and just get rid of the special case,
instead of coming up with increasingly complex ways to deal with it.
pfn_to_online_page() used to be simple, essentially checking a single
flag was sufficient in most setups.

-- 
Thanks,

David / dhildenb

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help