Re: [PATCH 1/4] memcg: simplify consume_stock

From: Vlastimil Babka <hidden>
Date: 2025-05-07 11:42:35
Also in: bpf, cgroups, linux-mm, lkml

On 5/7/25 12:55 AM, Shakeel Butt wrote:

The consume_stock() does not need to check gfp_mask for spinning and can
simply trylock the local lock to decide to proceed or fail. No need to
spin at all for local lock.
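
The trylock-or-fail pattern described above can be sketched in userspace C.
This is only an illustration, not the kernel code: pthread_mutex_trylock()
stands in for the local lock trylock, and struct stock and
consume_stock_sketch() are hypothetical names invented for this example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a per-CPU charge cache. */
struct stock {
	pthread_mutex_t lock;   /* assumption: plays the role of the local lock */
	unsigned int nr_pages;  /* pages already charged and cached locally */
};

/*
 * Try to satisfy a charge from the cached stock. No gfp_mask check and
 * no spinning: if the lock is busy, report failure so the caller can
 * fall back to the slow path.
 */
bool consume_stock_sketch(struct stock *s, unsigned int nr_pages)
{
	bool ret = false;

	if (pthread_mutex_trylock(&s->lock) != 0)
		return false;           /* lock held elsewhere: take the slow path */

	if (s->nr_pages >= nr_pages) {
		s->nr_pages -= nr_pages;
		ret = true;             /* charge served from the cache */
	}

	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

A caller would treat a false return the same way whether the lock was
contended or the cache was simply too small, which is what makes dropping
the gfp_mask check possible in this sketch.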

One of the concerns raised was that on PREEMPT_RT kernels, this trylock
can fail more often because a task holding the local_lock can be preempted.
This can potentially cause the task which has preempted the task holding
the local_lock to take the slow path of memcg charging.

However, this behavior will only impact performance if the memcg charging
slow path is worse than the two context switches and possible scheduling
delay of the current code. From the network-intensive workload
experiment, that does not seem to be the case.

We ran a varying number of netperf clients in different cgroups on a
72-CPU machine with a PREEMPT_RT config.

 $ netserver -6
 $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K

number of clients | Without series | With series
  6               | 38559.1 Mbps   | 38652.6 Mbps
  12              | 37388.8 Mbps   | 37560.1 Mbps
  18              | 30707.5 Mbps   | 31378.3 Mbps
  24              | 25908.4 Mbps   | 26423.9 Mbps
  30              | 22347.7 Mbps   | 22326.5 Mbps
  36              | 20235.1 Mbps   | 20165.0 Mbps

We don't see any significant performance difference for the network
intensive workload with this series.
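
To put a number on "significant", the relative change per row can be
computed from the table. The sketch below is only an illustration of that
arithmetic; pct_change() and print_deltas() are hypothetical helper names,
and the figures are copied from the table above. Worked by hand, the
changes stay within roughly 2.2% in either direction.

```c
#include <stdio.h>

/* Relative change from `before` to `after`, in percent. */
double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Throughput in Mbps, copied from the table in the commit message. */
static const struct {
	int clients;
	double without_series;
	double with_series;
} rows[] = {
	{  6, 38559.1, 38652.6 },
	{ 12, 37388.8, 37560.1 },
	{ 18, 30707.5, 31378.3 },
	{ 24, 25908.4, 26423.9 },
	{ 30, 22347.7, 22326.5 },
	{ 36, 20235.1, 20165.0 },
};

/* Print one line per row with the signed percentage change. */
void print_deltas(void)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%2d clients: %+.2f%%\n", rows[i].clients,
		       pct_change(rows[i].without_series, rows[i].with_series));
}
```

Calling print_deltas() from a main() prints the signed delta for each
client count.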

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>

Reviewed-by: Vlastimil Babka <redacted>
