Re: mdcheck: slow system issues
From: Phil Turmel <hidden>
Date: 2020-03-31 12:14:22
On 3/31/20 6:53 AM, Peter Grandi wrote:
>> Dear Linux folks, When `mdcheck` runs on two 100 TB software RAIDs our users complain about being unable to open files in a reasonable time. [...]
>>
>>     109394518016 blocks super 1.2 level 6, 512k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
>
> Unsurprisingly it is a 16-wide RAID6 of 8TB HDDs.
With a 512k chunk. Definitely not suitable for anything but large media file streaming.
>> [...] The article *Software RAID check - slow system issues* [1] recommends to lower `dev.raid.speed_limit_max`, but the RAID should easily be able to do 200 MB/s as our tests show over 600 MB/s during some benchmarks.
>
> Many people have to find out the hard way that on HDDs sequential and random IO rates differ by "up to" two orders of magnitude, and that RAID6 gives an "interesting" tradeoff between read and write speed with random vs. sequential access.
The random/streaming threshold is proportional to the address stride on one device, that is, (approximately) the gap in raid sector numbers between one chunk and the next chunk on that device. That is basically chunk * (n-2). With so many member devices, the transition from random-access performance to streaming performance requires that much larger accesses.

I configure any raid6 that might have some random loads with a 16k or 32k chunk size.

Finally, the stripe cache size should be optimized on the system in question. More is generally better, unless it starves the OS of buffers. Adjust and test, with real loads.
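
To put rough numbers on the above, here is a small Python sketch. It is only an illustration: the function names are made up for this example, the stride uses the approximate chunk * (n-2) figure given above, and the stripe cache estimate uses the formula from the kernel md documentation (page size * number of member devices * stripe_cache_size), assuming a 4 KiB page size.

```python
# Illustrative sketch only; the helper names are hypothetical, not part of any tool.
KIB = 1024
MIB = 1024 * KIB


def raid6_stride_bytes(chunk_kib: int, n_devices: int) -> int:
    """Approximate array-address gap between one chunk and the next chunk
    on the same member device of a RAID6: chunk * (n - 2)."""
    return chunk_kib * KIB * (n_devices - 2)


def stripe_cache_bytes(stripe_cache_size: int, n_devices: int,
                       page_size: int = 4096) -> int:
    """RAM consumed by the md stripe cache:
    page_size * nr_disks * stripe_cache_size (4 KiB pages assumed)."""
    return page_size * n_devices * stripe_cache_size


if __name__ == "__main__":
    # The 16-wide array from this thread, with its 512k chunk and the
    # smaller chunk sizes suggested for mixed loads.
    for chunk in (512, 32, 16):
        stride = raid6_stride_bytes(chunk, 16)
        print(f"{chunk:>3}k chunk, 16 devices: stride ~ {stride // KIB} KiB")

    # Memory cost of a few stripe_cache_size settings on 16 devices.
    for entries in (256, 4096, 16384):
        used = stripe_cache_bytes(entries, 16)
        print(f"stripe_cache_size={entries}: ~ {used // MIB} MiB of RAM")
```

With 16 devices the stride comes to about 7 MiB for a 512k chunk versus 448 KiB or 224 KiB for 32k or 16k chunks, which is why accesses have to be so much larger before the array behaves like a streaming device. The second loop shows how quickly the stripe cache eats RAM on a wide array, hence "adjust and test".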
>> How do you run `mdcheck` in production without noticeably affecting the system?
>
> Fortunately the only solution that works well is quite simple: replace the storage system with one with much increased IOPS-per-TB (that is SSDs or much smaller HDDs, 1TB or less) *and* switch from RAID6 to RAID10.
These are good choices too, though not cheap.

Phil