Re: [PATCH] memcg: remove unneeded preempt_disable
From: Greg Thelen <hidden>
Date: 2011-08-19 00:00:58
Also in:
linux-mm, lkml
On Thu, Aug 18, 2011 at 2:40 PM, Andrew Morton [off-list ref] wrote:
(cc linux-arch) On Wed, 17 Aug 2011 23:50:53 -0700 Greg Thelen [off-list ref] wrote:quoted
Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were unnecessarily disabling preemption when adjusting per-cpu counters: preempt_disable() __this_cpu_xxx() __this_cpu_yyy() preempt_enable() This change does not disable preemption and thus CPU switch is possible within these routines. This does not cause a problem because the total of all cpu counters is summed when reporting stats. Now both mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like: this_cpu_xxx() this_cpu_yyy() ...--- a/mm/memcontrol.c +++ b/mm/memcontrol.c@@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,static void mem_cgroup_charge_statistics(struct mem_cgroup *mem, bool file, int nr_pages) { - preempt_disable(); - if (file) - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages); + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages); else - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages); + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages); /* pagein of a big page is an event. So, ignore page size */ if (nr_pages > 0) - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); else { - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); nr_pages = -nr_pages; /* for event */ } - __this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); - - preempt_enable(); + this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); }On non-x86 architectures this_cpu_add() internally does preempt_disable() and preempt_enable(). So the patch is a small optimisation for x86 and a larger deoptimisation for non-x86. I think I'll apply it, as the call frequency is low (correct?) and the problem will correct itself as other architectures implement their atomic this_cpu_foo() operations.
mem_cgroup_charge_statistics() is a common operation, which is called on each memcg page charge and uncharge. The per arch/config effects of this patch: * non-preemptible kernels: there's no difference before/after this patch. * preemptible x86: this patch helps by removing an unnecessary preempt_disable/enable. * preemptible non-x86: this patch hurts by adding implicit preempt_disable/enable around each operation. So I am uncomfortable this patch's unmeasured impact on archs that do not have atomic this_cpu_foo() operations. Please drop the patch from mmotm. Sorry for the noise.