Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)
From: "David Hildenbrand (Arm)" <david@kernel.org>
Date: 2026-06-15 15:19:13
Also in: cgroups, damon, linux-cxl, lkml
On 6/15/26 16:38, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 17:29, Gregory Price wrote:
>> On Wed, Jun 10, 2026 at 04:12:52PM -0400, Gregory Price wrote:
>>> ... snip ...
>>>
>>> I will still probably send the next RFC version tomorrow or Friday, as I want to get some eyes on the __GFP_PRIVATE-less pattern.
>>>
>>> Also, I made a new `anondax` driver which enables userland testing of this functionality without any specialty hardware.
>>
>> (apologies for the length of this email: this will all be covered in the coming cover letter, but I just wanted to share a bit of a preview)
>>
>> ===
>>
>> Just another small update - I am planning to post the RFC today once I get some mild cleanup done.
>>
>> It will be based on the dax atomic hotplug
>> https://lore.kernel.org/linux-mm/20260605211911.2160954-1-gourry@gourry.net/
>>
>> But a couple specific details regarding the memalloc pieces that I've learned the past couple of days playing with it.
>>
>> 1) memalloc_folio is required to ensure non-folio allocations don't land on the private node, even if it happens within a memalloc_private context. Since memalloc_folio may be useful in contexts outside of private nodes, I kept this as a separate flag. If we think there will *never* be additional users of memalloc_folio, then we could fold _folio into _private to save the flag for now and add it back when we actually need it.
>>
>> 2) memalloc_private is needed to unlock private nodes, but in the original NOFALLBACK-only design, you also needed __GFP_THISNODE. This is *highly* restrictive. I found when playing with mbind that MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally implies a bug).
>>
>> That leads me to #3
>
> I think the memalloc approach is dangerous due to unexpected nesting. There might be nested page allocations in page allocation itself (due to some debugging option). But also interrupts do not change what "current" points to. Suddenly those could start requesting folios and/or private nodes and be surprised, I'm afraid.
Yeah, we'd need some way to distinguish the main allocation from these other (nested) allocations.
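To make the nesting concern concrete, here is a toy userspace model. It is not kernel code: `PF_MODEL_PRIVATE`, `model_private_save()` and friends are made-up stand-ins for the proposed memalloc_private scope, and the "interrupt" is just a function call that runs while the scope is still set on the same task.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>

#define PF_MODEL_PRIVATE 0x1u	/* stand-in for a memalloc_private scope bit */

enum node_kind { NODE_NORMAL, NODE_PRIVATE };

struct task {
	unsigned int flags;
};

static struct task task_a;
static struct task *current_task = &task_a;	/* stands in for "current" */

/* Enter the scope; return the previous state so scopes can nest. */
static unsigned int model_private_save(void)
{
	unsigned int old = current_task->flags & PF_MODEL_PRIVATE;

	current_task->flags |= PF_MODEL_PRIVATE;
	return old;
}

/* Leave the scope, putting back whatever state was saved. */
static void model_private_restore(unsigned int old)
{
	current_task->flags = (current_task->flags & ~PF_MODEL_PRIVATE) | old;
}

/* An allocator that picks the node purely from task state. */
static enum node_kind model_alloc(void)
{
	return (current_task->flags & PF_MODEL_PRIVATE) ? NODE_PRIVATE
							 : NODE_NORMAL;
}

/*
 * Simulated interrupt handler: it runs on top of the interrupted task,
 * so current_task (and its flags) are exactly what the task left there.
 */
static enum node_kind model_irq_handler(void)
{
	return model_alloc();
}

static void model_demo(void)
{
	unsigned int old = model_private_save();

	assert(model_alloc() == NODE_PRIVATE);		/* intended */
	assert(model_irq_handler() == NODE_PRIVATE);	/* inherited by accident */

	model_private_restore(old);
	assert(model_irq_handler() == NODE_NORMAL);
}
```

Calling `model_demo()` from a `main()` exercises the model. The second assertion is the problem case: nothing the handler did asked for a private node, yet it got one because the decision was keyed off task state alone.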
> The memalloc scopes only work well when they restrict the context wrt reclaim, and allocations in IRQ have to be already restricted heavily (atomic) so further memalloc restrictions don't do anything in practice. But to make them change other aspects of the allocations like this won't work.
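For reference, a small userspace model of how the existing scopes behave. The masking logic follows what current_gfp_context() does as far as I recall it; the bit values and all `M_`/`model_` names are invented for the sketch and do not match the kernel's definitions.

```c
/* Userspace model only; bit values are made up. */
#include <assert.h>

/* Model gfp bits */
#define M_GFP_IO	0x01u
#define M_GFP_FS	0x02u
#define M_GFP_MOVABLE	0x04u

/* Model task scope bits (NOIO / NOFS / PIN scopes) */
#define M_PF_NOIO	0x1u
#define M_PF_NOFS	0x2u
#define M_PF_PIN	0x4u

/* Apply the active scopes to a requested gfp mask. */
static unsigned int model_gfp_context(unsigned int pflags, unsigned int gfp)
{
	if (pflags & M_PF_NOIO)
		gfp &= ~(M_GFP_IO | M_GFP_FS);
	else if (pflags & M_PF_NOFS)
		gfp &= ~M_GFP_FS;

	if (pflags & M_PF_PIN)
		gfp &= ~M_GFP_MOVABLE;

	return gfp;
}

/* Nonzero if the scoped mask is a subset of the requested one. */
static int model_only_restricts(unsigned int pflags, unsigned int gfp)
{
	unsigned int out = model_gfp_context(pflags, gfp);

	return (out & ~gfp) == 0;
}
```

Every scope here only clears bits, so a nested or interrupt-context allocation that inherits one can end up more constrained than it asked for, but never gains a capability it did not request. A scope that unlocks private nodes would be the first to go the other way.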
I was assuming that memalloc_pin_save() would already violate that, but really it only restricts where movable allocations land, and that doesn't matter for other kernel allocations.

Do you see any other way to make something like an allocation context work, and avoid introducing more GFP flags?

--
Cheers,

David