Thread: 2 messages, 2 authors, 2022-11-30

Re: [RFC PATCH V1] mm: Disable demotion from proactive reclaim

From: Huang, Ying <hidden>
Date: 2022-11-30 05:40:18
Also in: cgroups, lkml

Mina Almasry [off-list ref] writes:
On Wed, Nov 23, 2022 at 9:52 PM Huang, Ying [off-list ref] wrote:
quoted
Hi, Johannes,

Johannes Weiner [off-list ref] writes:
[...]
quoted
The fallback to reclaim actually strikes me as wrong.

Think of reclaim as 'demoting' the pages to the storage tier. If we
have a RAM -> CXL -> storage hierarchy, we should demote from RAM to
CXL and from CXL to storage. If we reclaim a page from RAM, it means
we 'demote' it directly from RAM to storage, bypassing potentially a
huge amount of pages colder than it in CXL. That doesn't seem right.

If demotion fails, IMO it shouldn't satisfy the reclaim request by
breaking the layering. Rather it should deflect that pressure to the
lower layers to make room. This makes sure we maintain an aging
pipeline that honors the memory tier hierarchy.
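A minimal user-space sketch of the aging pipeline described above, assuming a fixed RAM -> CXL -> storage model. The names `struct tier` and `demote_one` are hypothetical and this is not kernel code; it only illustrates deflecting pressure to the next lower tier instead of bypassing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-tier model; the last tier stands in for storage. */
enum { TIER_RAM, TIER_CXL, TIER_STORAGE, NR_TIERS };

struct tier {
	size_t used;      /* pages currently resident in this tier */
	size_t capacity;  /* pages this tier can hold */
};

/*
 * Move one cold page from tier 'from' to the next lower tier.
 * If the lower tier is full, first push one page out of it, so the
 * pressure is deflected downward rather than skipping a tier.
 * Returns false when there is nothing to demote or no room can be made.
 */
static bool demote_one(struct tier tiers[NR_TIERS], int from)
{
	assert(from >= 0 && from < NR_TIERS);

	/* The lowest tier has nowhere to demote to. */
	if (from == TIER_STORAGE || tiers[from].used == 0)
		return false;

	int to = from + 1;

	/* Make room in the lower tier before moving the page down. */
	if (tiers[to].used == tiers[to].capacity && !demote_one(tiers, to))
		return false;

	tiers[from].used--;
	tiers[to].used++;
	return true;
}
```

With RAM and CXL both full, demoting one page from RAM first moves one page from CXL to storage, so the page leaving RAM lands in CXL rather than going straight to storage.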
Yes.  I think that we should avoid falling back to reclaim as much as
possible too.  Now, when we allocate memory for demotion
(alloc_demote_page()), __GFP_KSWAPD_RECLAIM is used.  So, we will trigger
I may be missing something but as far as I can tell reclaim is disabled
for allocations from lower tier memory:
https://elixir.bootlin.com/linux/v6.1-rc7/source/mm/vmscan.c#L1583
#define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)

We have GFP_NOWAIT set in gfp.
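A small user-space sketch of the flag arithmetic under discussion. The `SK_` names and bit values are illustrative assumptions, not the kernel's definitions. The sketch also assumes that the demotion mask clears both reclaim bits before OR-ing in GFP_NOWAIT; the thread itself only states that GFP_NOWAIT is set. Under that assumption, direct reclaim is off while waking kswapd on the target node is still allowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; see the kernel's gfp headers for the real ones. */
#define SK_GFP_DIRECT_RECLAIM 0x400u
#define SK_GFP_KSWAPD_RECLAIM 0x800u
#define SK_GFP_RECLAIM (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_NOWAIT  (SK_GFP_KSWAPD_RECLAIM)

/* Mirrors the definition quoted in the thread: GFP_NOWAIT is kswapd reclaim only. */
static_assert(SK_GFP_NOWAIT == SK_GFP_KSWAPD_RECLAIM,
	      "GFP_NOWAIT must not include direct reclaim");

/* Assumed shape of the demotion mask: drop all reclaim bits, then add NOWAIT. */
static unsigned int sk_demotion_gfp(unsigned int base)
{
	return (base & ~SK_GFP_RECLAIM) | SK_GFP_NOWAIT;
}

/* True if the allocating task itself may enter reclaim. */
static bool sk_may_direct_reclaim(unsigned int gfp)
{
	return (gfp & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* True if the allocation may wake kswapd on the target node. */
static bool sk_may_wake_kswapd(unsigned int gfp)
{
	return (gfp & SK_GFP_KSWAPD_RECLAIM) != 0;
}
```

Read this way, both statements in the thread can hold at once: the demotion allocation never reclaims synchronously, but it can still trigger background kswapd reclaim on the lower tier node.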
I think this is maybe a good thing when doing proactive demotion. In
this case we probably don't want to try to reclaim from lower tier
nodes and instead fail the proactive demotion.
Do you have some real use cases for this?  If so, we can tweak the
logic.
However I can see this being desirable when the top tier nodes are
under real memory pressure to deflect that pressure to the lower tier
nodes.
Yes.

Best Regards,
Huang, Ying
quoted
kswapd reclaim on the lower tier node to free some memory and avoid falling
back to reclaim on the current (higher tier) node.  This may not be good
enough; for example, the following patch from Hasan may help by waking up
kswapd earlier.

https://lore.kernel.org/linux-mm/b45b9bf7cd3e21bca61d82dcd1eb692cd32c122c.1637778851.git.hasanalmaruf@fb.com/

Do you know what the next step is for this patch?

Should we do even more?

From another point of view, I still think that we can use falling back
to reclaim as the last resort to avoid OOM in some special situations,
for example, when most pages in the lowest tier node are mlock()ed or too
hot to be reclaimed.
quoted
So I'm hesitant to design cgroup controls around the current behavior.
I sent RFC v2 patch:
https://lore.kernel.org/linux-mm/20221130020328.1009347-1-almasrymina@google.com/T/#u

Please take a look when convenient. Thanks!
quoted
quoted
Best Regards,
Huang, Ying
  