Re: How to prefer some devices over others in raid
From: Phil Turmel <hidden>
Date: 2014-01-01 20:21:50
On 01/01/2014 01:00 PM, Tomas M wrote:
Your initial post suggested you knew which drive was flaky. Now you indicate you don't know which, if any, is flaky. This suggests you have no idea why your array is slow.

Well, I always have an indication of which drive is flaky, based on dmesg output (e.g. hard resetting ATA3 link, etc). However, sometimes it reports that more than one drive has problems, and I can't be 100% sure which of the flaky drives is the "more flaky" one :) and it is too late to replace any of them, since there is a high chance that the other one dies as well during resync (which has happened to me a few times already). From my point of view it is better to keep the array in sync as long as I can, and copy the data somewhere as fast as I can.
If you've experienced drive drops during resync a "few times already", and you don't say that those drives were obviously dead, I suspect that you are using non-enterprise drives. Using non-enterprise drives in any raid array can expose you to false failures from the timeout mismatch problem.

If you care to share the output of "smartctl -x" for all of your drives, and "for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ; done", we can immediately figure that out for you.

If you want to understand the issue, search this list's archives for various combinations of "scterc", "URE", "timeout mismatch".

You should also see if your distro has a cron job that performs a "check" scrub on your arrays for you.

HTH,

Phil
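
For anyone following along, here is a minimal sketch of the timeout listing as a reusable function. The function name show_timeouts and its optional directory argument are hypothetical, added only so the loop can be pointed at a test directory instead of /sys/block. The commented smartctl and sync_action lines assume a drive at /dev/sdX and an array at md0, both placeholders, and normally need root.

```shell
#!/bin/sh
# Sketch only: list each block device's kernel command timeout in seconds.
# "show_timeouts" and its optional root argument are illustrative names,
# not part of any existing tool.
show_timeouts() {
    root="${1:-/sys/block}"
    for f in "$root"/*/device/timeout; do
        # Skip devices without a timeout attribute (or an unmatched glob).
        [ -r "$f" ] || continue
        dev=${f#"$root"/}     # strip the leading directory
        dev=${dev%%/*}        # keep only the device name, e.g. sda
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

# Related checks (placeholders: replace sdX and md0 with real names):
#   smartctl -l scterc /dev/sdX               # show the drive's SCT ERC setting
#   echo check > /sys/block/md0/md/sync_action   # start a "check" scrub by hand
```

Calling show_timeouts with no argument reads /sys/block. The value to compare against is how long the drive itself keeps retrying a bad sector, which is what the "scterc" output describes.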