Re: [PATCH v2 1/2] memcg: flush stats only if updated
From: Michal Koutný <hidden>
Date: 2021-10-15 17:55:38
Also in:
linux-mm, lkml
On Thu, Oct 14, 2021 at 09:31:46AM -0700, Shakeel Butt [off-list ref] wrote:
Thanks for your review. This forces me to think more on this because each update does not necessarily be a single page sized update e.g. adding a hugepage to an LRU.
Aha, hugepages... (I also noticed that on the opposite size scale are
NR_SLAB_{UN,}RECLAIMABLE_B, the complementary problem to too big error
would be too frequent flushes.)
Though I think the error is time bounded by 2 seconds but in those 2 seconds mathematically the error can be large.
Honestly, I can't tell how much the (transient) errors in various node_stat_item entries will or won't affect MM behavior. But having some guards on it sounds practical when some problems to troubleshoot arise.
What do you think of the following change? It will bound the error
better within the 2 seconds window.
[...]
-static inline void memcg_rstat_updated(struct mem_cgroup *memcg)
+static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
{
+ unsigned int x;
+
cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
- if (!(__this_cpu_inc_return(stats_updates) % MEMCG_CHARGE_BATCH))
- atomic_inc(&stats_flush_threshold);
+
+ x = abs(__this_cpu_add_return(stats_diff, val));
+ if (x > MEMCG_CHARGE_BATCH) {
+ atomic_add(x / MEMCG_CHARGE_BATCH, &stats_flush_threshold);
+ __this_cpu_write(stats_diff, 0);
+ }
}Looks better wrt meaningful error calculation (and hopefully still doesn't add too much to the hot path).
quoted hunk ↗ jump to hunk
@@ -807,7 +813,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, return; __this_cpu_add(memcg->vmstats_percpu->events[idx], count); - memcg_rstat_updated(memcg); + memcg_rstat_updated(memcg, val);
s/val/count/ (Just thinking loudly.) At one moment I thought, it could effectively be even s/val/0/ since events aren't(?) inputs for reclaim calculations. But with the introduced stats_diff it may happen stats_diff flickers (its abs value) within the MEMCG_CHARGE_BATCH and stats_flush_threshold would never be incremented. Basically disabling the periodic flush. Therefore the events must also increment the stats_diff.