Thread: 108 messages, 20 authors, 2022-02-08

Re: [PATCH v6 6/9] mm: multigenerational lru: aging

From: Yu Zhao <hidden>
Date: 2022-01-13 09:43:49
Also in: linux-doc, linux-mm, lkml

On Mon, Jan 10, 2022 at 03:37:28PM +0100, Michal Hocko wrote:
On Sun 09-01-22 20:58:02, Yu Zhao wrote:
On Fri, Jan 07, 2022 at 10:00:31AM +0100, Michal Hocko wrote:
On Fri 07-01-22 09:55:09, Michal Hocko wrote:
[...]
In this case, lru_gen_mm_walk is small (160 bytes); it's per direct
reclaimer; and direct reclaimers rarely come here, i.e., only when
kswapd can't keep up in terms of the aging, which is similar to the
condition where the inactive list is empty for the active/inactive
lru.
Well, this is not a strong argument to be honest. Kswapd being stuck
and the majority of the reclaim being done in the direct reclaim
context is a situation I have seen many many times.
Also do not forget that memcg reclaim is effectively only direct
reclaim. Not that the memcg reclaim indicates a global memory shortage
but it can add up and race with the global reclaim as well.
I don't dispute any of the above, and I probably don't like this code
more than you do.

But let's not forget that the purposes of PF_MEMALLOC, besides
preventing recursive reclaim, include letting reclaim dip into reserves
so that it can make more free memory. So I think it's acceptable if the
following conditions are met:
1. The allocation size is small.
2. The number of allocations is bounded.
3. Its failure doesn't stall reclaim.
And it'd be nice if
4. The allocation happens rarely, e.g., slow path only.
I would add 
  0. The allocation should be done only if absolutely _necessary_.

Please keep in mind that whatever you allocate from that context will be
consuming very precious memory reserves, which are shared with other
components of the system. Even worse, these allocations can go all the
way to depleting memory completely, at which point other things can fall
apart.
I agree but I also see a distinction:
   1,2,3 are objective;
   0,4 are subjective.

For some users, page reclaim itself might not be absolutely necessary
because they are okay with OOM kills. But for others, the situation
could be reversed.
The code in question meets all of them.

1. This allocation is 160 bytes.
2. It's bounded by the number of page table walkers which, in the
   worst case, is the same as the number of mm_structs.
3. Most importantly, its failure doesn't stall the aging. The aging
   will fall back to the rmap-based function lru_gen_look_around().
   But this function only gathers the accessed bit from at most 64
   PTEs, meaning it's less efficient (it retains ~80% of the
   performance gains).
4. This allocation is rare, i.e., only when the aging is required,
   which is similar to the low inactive case for the active/inactive
   lru.
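
The fallback in point 3 can be sketched in userspace C. This is only an
illustration under stated assumptions, not code from the patch: struct
walk_buf, alloc_fn, age_once() and the enum values are hypothetical
names, and the allocator is passed in as a function pointer so that a
failed allocation can be simulated. Only the 160-byte size and the
64-PTE limit come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the 160-byte lru_gen_mm_walk buffer. */
struct walk_buf {
	unsigned char data[160];
};
static_assert(sizeof(struct walk_buf) == 160, "buffer size from the thread");

/* The rmap-based fallback looks at no more than 64 PTEs. */
#define LOOKAROUND_MAX_PTES 64

enum aging_path {
	AGING_PAGE_TABLE_WALK,	/* buffer allocated: full page table walk */
	AGING_RMAP_LOOKAROUND,	/* allocation failed: bounded rmap fallback */
};

/* Hypothetical allocator hook; lets a caller simulate failure. */
typedef void *(*alloc_fn)(size_t size);

static void *alloc_ok(size_t size)
{
	return malloc(size);
}

static void *alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * One aging attempt. A failed allocation never stalls the caller: it
 * only selects the less efficient path, which scans fewer PTEs.
 */
static enum aging_path age_once(alloc_fn alloc, size_t ptes_available,
				size_t *ptes_scanned)
{
	struct walk_buf *walk = alloc(sizeof(*walk));

	if (!walk) {
		*ptes_scanned = ptes_available < LOOKAROUND_MAX_PTES ?
				ptes_available : LOOKAROUND_MAX_PTES;
		return AGING_RMAP_LOOKAROUND;
	}

	*ptes_scanned = ptes_available;
	free(walk);
	return AGING_PAGE_TABLE_WALK;
}
```

Calling age_once(alloc_fail, 512, &n) takes the fallback path and caps
n at 64, while age_once(alloc_ok, 512, &n) covers all 512 entries.
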
I think this fallback behavior deserves much more detailed explanation
in changelogs.
Will do.
The bottom line is I can try various optimizations, e.g., preallocate
a few buffers for a limited number of page walkers and if this number
has been reached, fall back to the rmap-based function. But I have yet
to see evidence that calls for additional complexity.
I would disagree here. This is not an optimization. You should be
avoiding allocations from the memory reclaim because any allocation just
adds runtime behavior complexity and potential corner cases.
Would __GFP_NOMEMALLOC address your concern? It prevents allocations
from accessing the reserves even under PF_MEMALLOC.
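
A minimal model of that interaction, as a sketch only: the MODEL_*
constants and may_use_reserves() are hypothetical, the bit values are
made up for illustration, and the real page allocator checks more
conditions than the two discussed here.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define MODEL_PF_MEMALLOC	(1u << 0)	/* task flag set during reclaim */
#define MODEL_GFP_NOMEMALLOC	(1u << 0)	/* per-allocation opt-out */

/*
 * May this allocation dip into the memory reserves?
 * The per-allocation opt-out wins over the task-wide flag.
 */
static bool may_use_reserves(unsigned int task_flags, unsigned int gfp_flags)
{
	if (gfp_flags & MODEL_GFP_NOMEMALLOC)
		return false;

	return (task_flags & MODEL_PF_MEMALLOC) != 0;
}
```

With the task flag set, may_use_reserves(MODEL_PF_MEMALLOC, 0) is true,
and adding MODEL_GFP_NOMEMALLOC to the allocation flags makes it false.
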
