Thread (35 messages) 35 messages, 4 authors, 2021-03-08

Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

From: Michal Hocko <hidden>
Date: 2021-02-19 09:18:19
Also in: linux-mm, lkml

On Wed 17-02-21 12:41:36, Tim Chen wrote:
On a per node basis, the mem cgroup soft limit tree on each node tracks
how much a cgroup has exceeded its soft limit memory limit and sorts
the cgroup by its excess usage.  On page release, the trees are not
updated right away, until we have gathered a batch of pages belonging to
the same cgroup. This reduces the frequency of updating the soft limit tree
and locking of the tree and associated cgroup.

However, the batch of pages could contain pages from multiple nodes but
only the soft limit tree from one node would get updated.  Change the
logic so that we update the tree in batch of pages, with each batch of
pages all in the same mem cgroup and memory node.  An update is issued for
the batch of pages of a node collected till now whenever we encounter
a page belonging to a different node.  Note that this batching for
the same node logic is only relevant for v1 cgroup that has a memory
soft limit.
Let me paste the discussion related to this patch from other reply:
quoted hunk ↗ jump to hunk
quoted
quoted
For patch 3 regarding the uncharge_batch, it
is more of an observation that we should uncharge in batch of same node
and not prompted by actual workload.
Thinking more about this, the worst that could happen
is we could have some entries in the soft limit tree that overestimate
the memory used.  The worst that could happen is a soft page reclaim
on that cgroup.  The overhead from extra memcg event update could
be more than a soft page reclaim pass.  So let's drop patch 3
for now.
I would still prefer to handle that in the soft limit reclaim path and
check each memcg for the soft limit reclaim excess before the reclaim.
Something like this?
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8bddee75f5cb..b50cae3b2a1a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3472,6 +3472,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
                if (!mz)
                        break;

+               /*
+                * Soft limit tree is updated based on memcg events sampling.
+                * We could have missed some updates on page uncharge and
+                * the cgroup is below soft limit.  Skip useless soft reclaim.
+                */
+               if (!soft_limit_excess(mz->memcg))
+                       continue;
+
                nr_scanned = 0;
                reclaimed = mem_cgroup_soft_reclaim(mz->memcg, pgdat,
Yes I meant something like this but then I have looked more closely and
this shouldn't be needed afterall. __mem_cgroup_largest_soft_limit_node
already does all the work
        if (!soft_limit_excess(mz->memcg) ||
            !css_tryget(&mz->memcg->css))
                goto retry;
so this shouldn't really happen.
-- 
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help