Re: [PATCH -next 0/3] md/raid10: reduce lock contention for io
From: Yu Kuai <hidden>
Date: 2022-08-30 01:09:58
Hi, Paul!

On 2022/08/29 21:58, Paul Menzel wrote:
> Dear Yu,
>
> Thank you for your patches.
>
> On 29.08.22 at 15:14, Yu Kuai wrote:
>> From: Yu Kuai <redacted>
>>
>> Patch 1 fixes a small problem found by code review.
>> Patch 2 avoids holding resync_lock in the fast path.
>> Patch 3 avoids holding a lock in wake_up() in the fast path.
>>
>> Test environment:
>>
>> Architecture: aarch64
>> CPU: Huawei KUNPENG 920, with four NUMA nodes
>>
>> Raid10 initialize:
>> mdadm --create /dev/md0 --level 10 --bitmap none --raid-devices 4 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
>>
>> Test cmd:
>> fio -name=0 -ioengine=libaio -direct=1 -group_reporting=1 -randseed=2022 -rwmixread=70 -refill_buffers -filename=/dev/md0 -numjobs=16 -runtime=60s -bs=4k -iodepth=256 -rw=randread
>>
>> Test result:
>> before this patchset: 2.9 GiB/s
>> after this patchset:  6.6 GiB/s
>
> Could you please give more details about the test setup, like the drives used?

The test setup is described above; four NVMe disks are used.

> Did you use some tools like ftrace to figure out the bottleneck?

Yes, I'm sure the bottleneck is spin_lock(); specifically, threads from multiple nodes try to grab the same lock. By the way, if I bind the threads to the same node, performance also improves to 6.6 GiB/s without this patchset.

Thanks,
Kuai

>> Please note that on kunpeng-920, memory access latency across nodes is much worse than on the local node, so on other architectures the performance improvement might not be significant.
>>
>> Yu Kuai (3):
>>   md/raid10: fix improper BUG_ON() in raise_barrier()
>>   md/raid10: convert resync_lock to use seqlock
>>   md/raid10: prevent unnecessary calls to wake_up() in fast path
>>
>>  drivers/md/raid10.c | 88 +++++++++++++++++++++++++++++----------------
>>  drivers/md/raid10.h |  2 +-
>>  2 files changed, 59 insertions(+), 31 deletions(-)
>
> Kind regards,
>
> Paul
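
For readers unfamiliar with the idea behind patch 2, here is a minimal userspace sketch of a sequence-counter fast path. It is not the code from the patch and does not use the kernel seqlock API; every name in it (conf_sketch, enter_fast, raise_barrier_sketch and so on) is hypothetical, and it assumes C11 atomics. A real writer side would also hold a spinlock to serialize writers and would wait for pending I/O to drain, both of which are left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state that resync_lock protects. */
struct conf_sketch {
    atomic_uint seq;        /* even: no writer active, odd: update in progress */
    atomic_int  barrier;    /* non-zero while resync blocks normal I/O */
    atomic_int  nr_pending; /* normal I/O currently in flight */
};

static void conf_init(struct conf_sketch *c)
{
    atomic_init(&c->seq, 0u);
    atomic_init(&c->barrier, 0);
    atomic_init(&c->nr_pending, 0);
}

/* Reader side: sample the sequence, spinning while a writer is mid-update. */
static unsigned read_begin(struct conf_sketch *c)
{
    unsigned s;

    while ((s = atomic_load(&c->seq)) & 1u)
        ;
    return s;
}

/* True if a writer ran since read_begin() returned s. */
static bool read_retry(struct conf_sketch *c, unsigned s)
{
    return atomic_load(&c->seq) != s;
}

/* Writer side (assumption: writers are serialized by a lock not shown). */
static void raise_barrier_sketch(struct conf_sketch *c)
{
    atomic_fetch_add(&c->seq, 1u);    /* sequence becomes odd */
    atomic_fetch_add(&c->barrier, 1);
    atomic_fetch_add(&c->seq, 1u);    /* sequence becomes even again */
}

static void lower_barrier_sketch(struct conf_sketch *c)
{
    atomic_fetch_add(&c->seq, 1u);
    atomic_fetch_sub(&c->barrier, 1);
    atomic_fetch_add(&c->seq, 1u);
}

/*
 * Fast path: try to register an I/O without taking any lock.
 * Returns false if a barrier is raised or a writer raced with us;
 * the caller would then fall back to the locked slow path.
 */
static bool enter_fast(struct conf_sketch *c)
{
    unsigned s = read_begin(c);

    if (atomic_load(&c->barrier))
        return false;

    atomic_fetch_add(&c->nr_pending, 1);
    if (!read_retry(c, s))
        return true;

    /* A writer changed the state under us: undo and report failure. */
    atomic_fetch_sub(&c->nr_pending, 1);
    return false;
}

/* Called when the I/O registered by enter_fast() completes. */
static void leave_sketch(struct conf_sketch *c)
{
    atomic_fetch_sub(&c->nr_pending, 1);
}
```

In this sketch the common case (no barrier raised) touches only atomic counters, so threads on different NUMA nodes do not serialize on one spinlock; only the rare writer and the fallback path would need the lock.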