Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)

From: Gregory Price <gourry@gourry.net>
Date: 2026-06-15 15:21:03
Also in: cgroups, damon, linux-cxl, lkml

On Mon, Jun 15, 2026 at 04:38:43PM +0200, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 17:29, Gregory Price wrote:
>> 1) memalloc_folio is required to ensure non-folio allocations don't land
>>    on the private node, even if it happens within a memalloc_private
>>    context.  Since memalloc_folio may be useful in contexts outside of
>>    private nodes, I kept this as a separate flag.
>>
>>    If we think there will *never* be additional users of memalloc_folio,
>>    then we could fold _folio into _private to save the flag for now and
>>    add it back when we actually need it.
>>
>> 2) memalloc_private is needed to unlock private nodes, but in the
>>    original NOFALLBACK-only design, you also needed __GFP_THISNODE.
>>
>>    This is *highly* restrictive.  I found when playing with mbind that
>>    MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally
>>    implies a bug).
>>
>>    That leads me to #3
>
> I think the memalloc approach is dangerous due to unexpected nesting. There
> might be nested page allocations in page allocation itself (due to some
> debugging option). But also interrupts do not change what "current" points
> to. Suddenly those could start requesting folios and/or private nodes and be
> surprised, I'm afraid.
>
> The memalloc scopes only work well when they restrict the context wrt
> reclaim, and allocations in IRQ have to be already restricted heavily
> (atomic) so further memalloc restrictions don't do anything in practice. But
> to make them change other aspects of the allocations like this won't work.

Reduced to practice, I have had success with this. However, what you are
describing could probably be resolved by re-introducing fallback list
isolation.  If private nodes are not in fallback lists, and they're not
N_MEMORY, then they're unreachable via nodemask-fallbacks, and a
specific node has to be requested.  For everything else memalloc locks
them out regardless.
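
As a rough illustration of that reachability argument, here is a userspace
toy model. It is not kernel code: every struct, function, and constant below
is hypothetical and only mirrors the two rules described above (private nodes
are left out of fallback lists, and an explicit request for a private node
additionally needs the memalloc_private scope).

```c
#include <stdbool.h>

#define MAX_NODES 4
#define NO_NODE   (-1)

/* Toy model only: none of these names are kernel APIs. */
struct toy_node {
	bool is_private;	/* private node: kept out of fallback lists */
	int  free_pages;
};

struct toy_ctx {
	bool memalloc_private;	/* scope flag that unlocks private nodes */
};

static struct toy_node nodes[MAX_NODES] = {
	{ .is_private = false, .free_pages = 1 },
	{ .is_private = false, .free_pages = 1 },
	{ .is_private = true,  .free_pages = 8 },	/* e.g. compressed RAM */
	{ .is_private = false, .free_pages = 0 },
};

/* Build the fallback list for 'home', skipping private nodes entirely. */
static int build_fallback(int home, int out[MAX_NODES])
{
	int n = 0;

	for (int i = 0; i < MAX_NODES; i++) {
		int nid = (home + i) % MAX_NODES;

		if (!nodes[nid].is_private)
			out[n++] = nid;
	}
	return n;
}

/* Nodemask-style allocation: can only ever see nodes on the fallback list. */
static int alloc_fallback(int home)
{
	int list[MAX_NODES];
	int n = build_fallback(home, list);

	for (int i = 0; i < n; i++) {
		if (nodes[list[i]].free_pages > 0) {
			nodes[list[i]].free_pages--;
			return list[i];
		}
	}
	return NO_NODE;
}

/* Explicit single-node request; private nodes also require the scope flag. */
static int alloc_on_node(const struct toy_ctx *ctx, int nid)
{
	if (nodes[nid].is_private && !ctx->memalloc_private)
		return NO_NODE;
	if (nodes[nid].free_pages == 0)
		return NO_NODE;
	nodes[nid].free_pages--;
	return nid;
}
```

In this model, a fallback allocation that exhausts nodes 0 and 1 fails rather
than spilling onto node 2, and node 2 is only reachable through
alloc_on_node() with memalloc_private set.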

In v5 I actually stripped this all the way back to just memalloc flags
and implemented a bunch of pressure tests to try to detect leakage - and
I was not able to do so - even with all nodes in each other's fallback
lists.

We can tack on both fallback list isolation and __GFP_THISNODE
requirements on top without ABI implications if we find that the
memalloc flags alone are insufficient.

The only place I think this will matter is in the reclaim / demotion
code, where we would need to rework the allocation code to handle private
nodes more explicitly.  This has no ABI implications, and the entire
demotion logic in vmscan.c is utterly broken anyway and needs a rewrite.

I'm running a mass build test at the moment and it's looking clean.  I'm
expecting to be able to test the new code today or tomorrow.

~Gregory