Hello Xinchen.
On Tue, Sep 30, 2025 at 09:35:50AM +0000, Cai Xinchen [off-list ref] wrote:
> I discovered that the DataNode process had requested a large amount
> of page cache. most of the page cache was concentrated in one NUMA node,
> ultimately leading to the exhaustion of memory in that NUMA node.
> [...]
> This issue can be resolved by migrating the DataNode into
> a cpuset, dropping the cache, and setting cpuset.memory_spread_page to
> allow it to evenly request memory.
Would it work in your case to apply memory.max instead, or to apply
MPOL_INTERLEAVE to the DataNode process?
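To illustrate the MPOL_INTERLEAVE option, here is a minimal sketch, not a
tested fix for this report. It assumes Linux with CONFIG_NUMA and uses the
raw set_mempolicy(2) syscall so that libnuma is not required. The helper
names interleave_mask() and set_interleave() are hypothetical, and the node
count has to be supplied by the caller. Whether this spreads the DataNode's
page cache well enough in your setup is something you would need to verify.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

#ifndef MPOL_INTERLEAVE
#define MPOL_INTERLEAVE 3	/* value from <linux/mempolicy.h> */
#endif

/*
 * Hypothetical helper: build a node mask with bits 0..nr_nodes-1 set.
 * Assumes node IDs are contiguous from 0 and fit in one unsigned long.
 */
static unsigned long interleave_mask(unsigned int nr_nodes)
{
	unsigned int bits = 8 * sizeof(unsigned long);

	if (nr_nodes >= bits)
		return ~0UL;
	return (1UL << nr_nodes) - 1;
}

/*
 * Hypothetical helper: set MPOL_INTERLEAVE over 'mask' for the calling
 * thread. Returns 0 on success, -1 with errno set on failure (for
 * example on a kernel without NUMA support).
 */
static int set_interleave(unsigned long mask)
{
	/* maxnode is passed as "bits in the mask + 1", as libnuma does. */
	return (int)syscall(SYS_set_mempolicy, MPOL_INTERLEAVE,
			    &mask, 8 * sizeof(mask) + 1);
}
```

A small wrapper could call set_interleave(interleave_mask(2)) on a two-node
machine and then execvp() the DataNode command, since the task policy is
kept across fork and exec. Running the process under
"numactl --interleave=all" has the same effect without writing any code.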
In any case, please see commit 012c419f8d248 ("cgroup/cpuset-v1: Add
deprecation messages to memory_spread_page and memory_spread_slab"),
since your patchset would need to touch those places too.
Thanks,
Michal