Thread: 14 messages, 8 authors, 2021-08-24

Re: [PATCH] mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim

From: Shakeel Butt <hidden>
Date: 2021-08-17 19:10:36
Also in: cgroups, lkml

On Tue, Aug 17, 2021 at 11:03 AM Johannes Weiner [off-list ref] wrote:
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups. This is unexpected and undesirable as memory.low
is supposed to express non-OOMing memory priorities between cgroups.

The reason for this is proportional memory.low reclaim. When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, page scan
force is scaled down and diminished in proportion to the overage, to
the point where it can cause reclaim to fail as well - only in that
case we currently don't retry, and instead trigger OOM.
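
To make the scaling concrete, here is a minimal sketch in C of how scan
pressure can shrink as a cgroup's usage approaches its memory.low. It is
a simplified model under stated assumptions, not the code in mm/vmscan.c:
the helper name scaled_scan_target and its parameters are hypothetical,
and details of the real implementation are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): how many of lru_size
 * pages to scan in a cgroup that uses `usage` pages and has `low`
 * pages protected by memory.low.
 */
static uint64_t scaled_scan_target(uint64_t lru_size, uint64_t usage,
                                   uint64_t low)
{
    /* Treat a cgroup below its protection as sitting exactly at it. */
    uint64_t size = usage > low ? usage : low;

    /* Keep the 64-bit product below from overflowing. */
    assert(lru_size <= UINT32_MAX && low <= UINT32_MAX);

    /* The protected share of the cgroup is exempt from scanning. */
    return lru_size - lru_size * low / (size + 1);
}
```

In this model, lru_size = 1000, usage = 1050 and low = 1000 give a scan
target of 49 pages out of 1000, while the same cgroup with no memory.low
is scanned at the full 1000. A target that small may find nothing to
reclaim, which is the failure case described above.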

To fix this, hook proportional reclaim into the same retry logic we
have in place for when cgroups are skipped entirely. This way if
reclaim fails and some cgroups were scanned with diminished pressure,
we'll try another full-force cycle before giving up and OOMing.
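
The retry described above can be sketched as a small toy model in C.
Everything here is an assumption made for illustration: the struct and
function names, the TOY_BATCH constant, and the rule that a pass only
succeeds when a cgroup has a full batch of unprotected pages. Only the
control flow (flag reduced-force scans, retry once with protection
ignored before giving up) mirrors the idea in the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BATCH 32UL /* assumed: pages reclaimed per successful attempt */

struct cgroup_model {
    unsigned long usage; /* pages in use */
    unsigned long low;   /* memory.low protection, in pages */
};

struct scan_state {
    bool low_reclaim; /* retry pass: ignore memory.low */
    bool low_skipped; /* memory.low reduced or blocked reclaim somewhere */
};

/* One toy reclaim pass over all cgroups; returns pages reclaimed. */
static unsigned long reclaim_pass(struct cgroup_model *cgs, size_t n,
                                  struct scan_state *sc)
{
    unsigned long reclaimed = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long protection = 0;

        if (!sc->low_reclaim && cgs[i].low > 0) {
            protection = cgs[i].low;
            sc->low_skipped = true; /* remember that we held back */
        }
        /* Toy rule: reclaim succeeds only with a full batch of overage. */
        if (cgs[i].usage >= protection + TOY_BATCH) {
            cgs[i].usage -= TOY_BATCH;
            reclaimed += TOY_BATCH;
        }
    }
    return reclaimed;
}

/* Returns pages reclaimed; 0 means the caller would have to OOM. */
static unsigned long try_to_free(struct cgroup_model *cgs, size_t n,
                                 unsigned int *passes)
{
    struct scan_state sc = { .low_reclaim = false, .low_skipped = false };
    unsigned long reclaimed;

    assert(passes != NULL);
    *passes = 0;
    for (;;) {
        (*passes)++;
        reclaimed = reclaim_pass(cgs, n, &sc);
        if (reclaimed || !sc.low_skipped)
            return reclaimed;
        /* Nothing freed, but memory.low held us back: retry at full force. */
        sc.low_reclaim = true;
        sc.low_skipped = false;
    }
}
```

In this model, a cgroup with usage 1010 and low 1000 yields nothing on
the first pass but 32 pages on the retry, instead of reporting failure
after a single reduced-force pass.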

Reported-by: Leon Yang <redacted>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Should this be considered for stable?

Reviewed-by: Shakeel Butt <redacted>

[...]
 static inline void mem_cgroup_calculate_protection(struct mem_cgroup *root,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4620df62f0ff..701106e1829c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -100,9 +100,12 @@ struct scan_control {
        unsigned int may_swap:1;

        /*
-        * Cgroups are not reclaimed below their configured memory.low,
-        * unless we threaten to OOM. If any cgroups are skipped due to
-        * memory.low and nothing was reclaimed, go back for memory.low.
+        * Cgroup memory below memory.low is protected as long as we
+        * don't threaten to OOM. If any cgroup is reclaimed at
+        * reduced force or passed over entirely due to its memory.low
+        * setting (memcg_low_skipped), and nothing is reclaimed as a
+        * result, then go back for one more cycle that reclaims