Re: kvm/arm64: Spark benchmark
From: Yu Zhao <hidden>
Date: 2023-06-18 20:11:55
Also in:
kvm, kvmarm, linux-arm-kernel, linux-doc, linux-mm, linux-trace-kernel, lkml
On Fri, Jun 9, 2023 at 7:04 AM Marc Zyngier [off-list ref] wrote:
On Fri, 09 Jun 2023 01:59:35 +0100, Yu Zhao [off-list ref] wrote:
TLDR
====
Apache Spark spent 12% less time sorting four billion random integers twenty times (in ~4 hours) after this patchset [1].
Why are the 3 architectures you have considered being evaluated with 3 different benchmarks?
I was hoping people with a special interest in a particular arch might try to reproduce the benchmarks that I didn't report (but did cover) and see what happens.
I am not suspecting you to have cherry-picked the best results
I'm generally very conservative when reporting *synthetic* results.
For example, the same memcached benchmark used on powerpc yielded >50%
improvement on aarch64, because the default Ubuntu Kconfig uses 64KB
base page size for powerpc but 4KB for aarch64. (Before the series,
the reclaim (swap) path takes kvm->mmu_lock for *write* on O(nr of all
pages to consider); after the series, it becomes O(actual nr of pages
to swap), which is <10% given how the benchmark was set up.)
         Ops/sec    Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency
-------------------------------------------------------------------------
Before   639511.40  0.09940       0.04700      0.27100      22.52700
After    974184.60  0.06471       0.04700      0.15900      3.75900
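As a quick sanity check on the ">50%" figure, the relative changes can be recomputed from the table above. This is only a minimal sketch in Python: the dictionary keys and the percent_change() helper are names made up for illustration and are not part of the benchmark tooling, and the latency units are left out because the table does not state them.

```python
# Figures copied verbatim from the Before/After table in this message.
# The key names are illustrative assumptions, not benchmark output fields.
before = {
    "ops_per_sec": 639511.40,
    "avg_latency": 0.09940,
    "p50_latency": 0.04700,
    "p99_latency": 0.27100,
    "p99.9_latency": 22.52700,
}
after = {
    "ops_per_sec": 974184.60,
    "avg_latency": 0.06471,
    "p50_latency": 0.04700,
    "p99_latency": 0.15900,
    "p99.9_latency": 3.75900,
}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0


# Positive means the metric went up; for latencies, negative is better.
changes = {name: percent_change(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name:>14}: {pct:+.1f}%")
```

Throughput comes out a little above +50%, p50 latency is unchanged, and the largest relative drop is in the p99.9 tail.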
but I'd really like to see a variety of benchmarks that exercise this stuff differently.
I'd be happy to try other synthetic workloads that people think are relatively representative. Also, I've backported the series and started an A/B experiment involving ~1 million devices (real-world workloads). We should have the preliminary results by the time I post the next version.