Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)
From: Gregory Price <gourry@gourry.net>
Date: 2026-06-15 15:21:03
Also in: cgroups, damon, linux-cxl, lkml
On Mon, Jun 15, 2026 at 04:38:43PM +0200, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 17:29, Gregory Price wrote:
> > 1) memalloc_folio is required to ensure non-folio allocations don't land on the private node, even if it happens within a memalloc_private context. Since memalloc_folio may be useful in contexts outside of private nodes, I kept this as a separate flag. If we think there will *never* be additional users of memalloc_folio, then we could fold _folio into _private to save the flag for now and add it back when we actually need it.
> >
> > 2) memalloc_private is needed to unlock private nodes, but in the original NOFALLBACK-only design, you also needed __GFP_THISNODE. This is *highly* restrictive. I found when playing with mbind that MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally implies a bug). That leads me to #3
>
> I think the memalloc approach is dangerous due to unexpected nesting. There might be nested page allocations in page allocation itself (due to some debugging option). But also interrupts do not change what "current" points to. Suddenly those could start requesting folios and/or private nodes and be surprised, I'm afraid.
>
> The memalloc scopes only work well when they restrict the context wrt reclaim, and allocations in IRQ have to be already restricted heavily (atomic) so further memalloc restrictions don't do anything in practice. But to make them change other aspects of the allocations like this won't work.

Reduced to practice, I have found success. However, what you are describing could probably be resolved by re-introducing fallback list isolation.

If private nodes are not in fallback lists, and they're not N_MEMORY, then they're unreachable via nodemask fallbacks, and a specific node has to be requested. For everything else, memalloc locks them out regardless.

In v5 I actually stripped this all the way back to just memalloc flags and implemented a bunch of pressure tests to try to detect leakage. I was not able to detect any, even with all nodes in each other's fallback lists.

We can tack on both fallback list isolation and __GFP_THISNODE requirements on top without ABI implications if we find that is insufficient.

The only place I think this will matter is in the reclaim / demotion code, where we would need to rework the allocation code to handle private nodes more explicitly. This has no ABI implications AND the entire demotion logic in vmscan.c is utterly broken anyway and needs a rewrite.

I'm running a mass build test at the moment, and it's looking clean. I'm expecting to be able to test the new code today or tomorrow.

~Gregory
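
The two positions in this exchange can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions, not code from the patch series: `PF_MEMALLOC_PRIVATE`, `memalloc_private_save()`, `memalloc_private_restore()`, `alloc_node()`, and the `fallback_isolation` switch are hypothetical names, loosely patterned after the existing memalloc scope helpers. It shows that a scope flag stored on "current" is seen by any allocation made while the scope is open (including one made from an interrupt, which does not change "current"), and that leaving private nodes out of the fallback order makes them reachable only by explicit request.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's real definitions. */
#define PF_MEMALLOC_PRIVATE 0x1u   /* scope flag: private nodes unlocked */
#define NR_NODES            3
#define PRIVATE_NODE        2      /* node 2 plays the private node */
#define NO_NODE             (-1)

struct task { unsigned int flags; };

/* Models "current". An interrupt handler would see this same task. */
static struct task current_task;

/* When true, private nodes are absent from every other node's fallback order. */
static bool fallback_isolation;

/* Open the scope; returns the previous state of the flag for nesting. */
static unsigned int memalloc_private_save(void)
{
    unsigned int old = current_task.flags & PF_MEMALLOC_PRIVATE;

    current_task.flags |= PF_MEMALLOC_PRIVATE;
    return old;
}

/* Close the scope, restoring whatever state the flag had before. */
static void memalloc_private_restore(unsigned int old)
{
    current_task.flags = (current_task.flags & ~PF_MEMALLOC_PRIVATE) | old;
}

/*
 * Pick the node an allocation lands on. The walk starts at the requested
 * node and falls back to the others in order; node_full[] simulates
 * memory pressure on each node.
 */
static int alloc_node(int requested, const bool node_full[NR_NODES])
{
    bool unlocked = (current_task.flags & PF_MEMALLOC_PRIVATE) != 0;

    for (int i = 0; i < NR_NODES; i++) {
        int nid = (requested + i) % NR_NODES;

        if (node_full[nid])
            continue;
        if (nid == PRIVATE_NODE) {
            /* Outside the scope, the private node is locked out. */
            if (!unlocked)
                continue;
            /* With isolation, it is reachable only by explicit request. */
            if (fallback_isolation && nid != requested)
                continue;
        }
        return nid;
    }
    return NO_NODE;
}
```

With nodes 0 and 1 full and isolation off, `alloc_node(0, ...)` fails outside the scope but lands on node 2 inside it, which is the leakage an interrupt-time allocation could hit. With isolation on, the same call fails again and only `alloc_node(PRIVATE_NODE, ...)` inside the scope succeeds.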