Re: RAID 5: low sequential write performance?
From: Corey Hickey <hidden>
Date: 2013-06-17 17:14:14
On 2013-06-17 07:22, Stan Hoeppner wrote:
> On 6/17/2013 1:39 AM, Corey Hickey wrote:
>> 32768 seems to be the maximum for the stripe cache. I'm quite happy to
>> spend 32 MB for this. 256 KB seems quite low, especially since it's only
>> half the default chunk size.
>
> FULL STOP. Your stripe cache is consuming *384MB* of RAM, not 32MB.
> Check your actual memory consumption. The value plugged into
> stripe_cache_size is not a byte value. The value specifies the number of
> data elements in the stripe cache array. Each element is #disks*4KB in
> size. The formula for calculating memory consumed by the stripe cache is:
>
> (num_of_disks * 4KB) * stripe_cache_size
>
> In your case this would be
>
> (3 * 4KB) * 32768 = 384MB
I'm actually seeing a bit more memory difference: 401-402 MB when going from 256 to 32768, on a mostly idle system, so maybe there's something else coming into play. Still, your formula does make more sense.

Apparently the idea of the value being KB is a common misconception, possibly perpetuated by this:

https://raid.wiki.kernel.org/index.php/Performance
---
# Set stripe-cache_size for RAID5.
echo "Setting stripe_cache_size to 16 MiB for /dev/md3"
echo 16384 > /sys/block/md3/md/stripe_cache_size
---

Is 256 really a reasonable default? Given what I've been seeing, it appears that either 256 is unreasonably low or I have something else wrong.
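As a quick sanity check on the formula quoted above, here is a minimal shell sketch of the arithmetic. It assumes 4 KB per disk per cache entry, exactly as in the formula; the function names stripe_cache_kb and stripe_cache_mb are made up for illustration and are not part of md or mdadm.

```sh
#!/bin/sh
# Estimate md stripe cache memory using the formula from this thread:
#   (num_of_disks * 4 KB) * stripe_cache_size
# Assumption: each cache entry holds one 4 KB page per member disk.

# Hypothetical helper: memory in KB for a given disk count and entry count.
stripe_cache_kb() {
    disks=$1
    entries=$2
    echo $(( disks * 4 * entries ))
}

# Hypothetical helper: the same estimate in MB (integer division).
stripe_cache_mb() {
    echo $(( $(stripe_cache_kb "$1" "$2") / 1024 ))
}

stripe_cache_mb 3 256      # default setting on a 3-disk array
stripe_cache_mb 3 32768    # maximum setting on a 3-disk array
```

By this estimate, going from 256 to 32768 entries on a 3-disk array accounts for roughly 381 MB, which is close to the 401-402 MB measured but does not explain all of it.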
>> mkfs.xfs /dev/m3
>> direct: 89.8 MB/s
>> not direct: 90.0 MB/s
>
> You didn't align XFS. Though with large streaming writes it won't matter
> much as md and the block layer will fill the stripes. However, XFS' big
> advantage is parallel IO and you're testing serial IO. Fire up 4 O_DIRECT
> threads/processes and compare to EXT4 w/4 write threads. The throughput
> gap will increase until you run out of hardware.
This will be something to test next time I rebuild my "real" array.

Thanks,
Corey
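For reference, the parallel test suggested above (4 concurrent O_DIRECT writers) could be sketched roughly as follows. This is only a sketch under stated assumptions: it uses GNU dd with oflag=direct, the helper name parallel_direct_write and the DRY_RUN switch are hypothetical, and /mnt/test stands in for wherever the filesystem under test is mounted.

```sh
#!/bin/sh
# Hypothetical helper: start several concurrent dd writers that open their
# output files with O_DIRECT (GNU dd's oflag=direct), then wait for all of
# them. With DRY_RUN=1 the commands are only printed, not executed.
parallel_direct_write() {
    dir=$1       # mount point of the filesystem under test (assumed)
    writers=$2   # number of concurrent writers, e.g. 4
    count=$3     # number of 1 MiB blocks each writer writes
    i=1
    while [ "$i" -le "$writers" ]; do
        out="$dir/ddtest.$i"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "dd if=/dev/zero of=$out bs=1M count=$count oflag=direct"
        else
            dd if=/dev/zero of="$out" bs=1M count="$count" oflag=direct &
        fi
        i=$((i + 1))
    done
    wait   # returns once every background writer has finished
}

# Print the commands for 4 writers of 1 GiB each without running them.
DRY_RUN=1 parallel_direct_write /mnt/test 4 1024
```

Running the function under `time` without DRY_RUN, once on XFS and once on EXT4, would give total bytes written over elapsed time as the aggregate throughput to compare; each dd also reports its own rate.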