RE: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks within one LLC | linux-arm-kernel


-----Original Message-----
From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
Sent: Thursday, May 6, 2021 12:30 AM
To: Song Bao Hua (Barry Song) <redacted>; Vincent Guittot
[off-list ref]
Cc: tim.c.chen@linux.intel.com; catalin.marinas@arm.com; will@kernel.org;
rjw@rjwysocki.net; bp@alien8.de; tglx@linutronix.de; mingo@redhat.com;
lenb@kernel.org; peterz@infradead.org; rostedt@goodmis.org;
bsegall@google.com; mgorman@suse.de; msys.mizuma@gmail.com;
valentin.schneider@arm.com; gregkh@linuxfoundation.org; Jonathan Cameron
[off-list ref]; juri.lelli@redhat.com; mark.rutland@arm.com;
sudeep.holla@arm.com; aubrey.li@linux.intel.com;
linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org;
linux-acpi@vger.kernel.org; x86@kernel.org; xuwei (O) [off-list ref];
Zengtao (B) [off-list ref]; guodong.xu@linaro.org; yangyicong
[off-list ref]; Liguozhu (Kenneth) [off-list ref];
linuxarm@openeuler.org; hpa@zytor.com
Subject: Re: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks
within one LLC

On 03/05/2021 13:35, Song Bao Hua (Barry Song) wrote:

[...]

From: Song Bao Hua (Barry Song)
[...]

From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
[...]

On 29/04/2021 00:41, Song Bao Hua (Barry Song) wrote:

-----Original Message-----
From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
[...]

From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
[...]

On 20/04/2021 02:18, Barry Song wrote:
[...]

On the other hand, according to "sched: Implement smarter wake-affine logic"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=62470419
a proper factor in wake_wide mainly benefits 1:n workloads such as
postgresql/pgbench.
So using the smaller cluster size as the factor might help make wake_affine
false and so improve pgbench.

According to the commit log, the commit made the biggest improvement when
clients = 2*cpus. In my case, that should be clients=48 for a machine whose
LLC size is 24.

In Linux, I created a 240MB database and ran "pgbench -c 48 -S -T 20 pgbench"
under two different scenarios:
1. page cache always hit, so no real I/O for database read
2. echo 3 > /proc/sys/vm/drop_caches

For case 1, using cluster_size and using llc_size give similar results,
tps = ~108000, with all 24 cpus at 100% cpu utilization.

For case 2, using llc_size still shows better performance.

tps for each test round (cluster size as factor in wake_wide):
1398.450887 1275.020401 1632.542437 1412.241627 1611.095692 1381.354294
1539.877146
avg tps = 1464

tps for each test round (llc size as factor in wake_wide):
1718.402983 1443.169823 1502.353823 1607.415861 1597.396924 1745.651814
1876.802168
avg tps = 1641  (+12%)
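
As a sanity check on the two averages above, here is a small sketch that recomputes them from the per-round figures. The helper names mean_tps() and gain_percent() and the array names are made up for illustration; only the numbers come from the runs quoted above.

```c
#include <stddef.h>

/* Hypothetical helper: arithmetic mean of n per-round tps samples. */
double mean_tps(const double *tps, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += tps[i];

    return n ? sum / (double)n : 0.0;
}

/* Hypothetical helper: relative gain of b over a, in percent. */
double gain_percent(double a, double b)
{
    return (b - a) / a * 100.0;
}

/* Per-round tps with cluster size as the wake_wide factor (from above). */
static const double cluster_tps[] = {
    1398.450887, 1275.020401, 1632.542437, 1412.241627,
    1611.095692, 1381.354294, 1539.877146,
};

/* Per-round tps with llc size as the wake_wide factor (from above). */
static const double llc_tps[] = {
    1718.402983, 1443.169823, 1502.353823, 1607.415861,
    1597.396924, 1745.651814, 1876.802168,
};
```

Calling mean_tps() on each array (7 rounds each) and passing the two means to gain_percent() gives the comparison between the llc-size and cluster-size runs. Note that the per-round spread within each run is large relative to the gap between the two means, so seven rounds is a fairly small sample.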

So it seems using cluster_size as the factor in
"slave >= factor && master >= slave * factor"
isn't a good choice, for my machine at least.
So SD size = 4 (instead of 24) seems to be too small for `-c 48`.

Just curious, have you seen the benefit of using wake wide on SD size =
24 (LLC) compared to not using it at all?
At least in the benchmark I ran today, I have not seen any benefit from using
llc_size. Always returning 0 in wake_wide() seems to be much better.

postgres@ubuntu:$pgbench -i pgbench
postgres@pgbench:$ pgbench -T 120 -c 48 pgbench

Using llc_size, it reached 123 tps.
Always returning 0 in wake_wide(), it reached 158 tps.

Actually, I really couldn't reproduce the performance improvement that
the commit "sched: Implement smarter wake-affine logic" mentioned.
On the other hand, the commit log didn't give the pgbench command
parameters used. I guess the benchmark result will depend heavily on
the command parameters and disk I/O speed.

Thanks
Barry

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
