Thread (8 messages), 4 authors, 2014-01-02

Re: How to prefer some devices over others in raid

From: Phil Turmel <hidden>
Date: 2014-01-01 20:21:50

On 01/01/2014 01:00 PM, Tomas M wrote:
> > Your initial post suggested you knew which drive was flaky.  Now you
> > indicate you don't know which, if any, is flaky.  This suggests you have
> > no idea why your array is slow.
>
> Well, I always have an indication which drive is flaky, based on dmesg
> output (e.g. hard resetting ATA3 link, etc). However, sometimes it
> reports that more than one drive has problems, and I can't be 100%
> sure which of the flaky drives is the "more flaky" one :) and it is
> too late to replace any of them, since there is high chance that the
> other one dies as well during resync (which happened to me few times
> already). From my point of view it is better for me to keep the array
> in sync as long as I can, and copy the data somewhere as fast as I
> can.

If you've experienced drive drops during resync a "few times already",
and you don't say that such drives were obviously dead, it makes me
suspicious that you are using non-enterprise drives.

Using non-enterprise drives in any raid array can expose you to false
failures from the timeout mismatch problem.  If you care to share the
output of "smartctl -x" for all of your drives, and "for x in
/sys/block/*/device/timeout ; do echo $x $(< $x) ; done", we can
immediately figure that out for you.
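
For anyone who finds the one-liner above hard to read, here is a minimal sketch of the same timeout listing as a small shell function. The function name show_timeouts and its optional directory argument are illustrative only (the argument exists so the function can be tried against a fake tree); on a real system it reads /sys/block, exactly as the one-liner does.

```shell
#!/bin/sh
# Print "<device> <seconds>" for every block device that exposes a
# kernel command timeout. Hypothetical helper, not part of any tool.
show_timeouts() {
    root="${1:-/sys/block}"          # assumed default: the real sysfs tree
    for f in "$root"/*/device/timeout; do
        [ -r "$f" ] || continue      # skip devices without a timeout file
        dev=${f#"$root"/}            # strip the leading directory
        dev=${dev%%/*}               # keep only the device name, e.g. sda
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Calling show_timeouts with no argument lists the live devices. The number printed is the kernel's per-command timeout in seconds, which is the value to compare against the drive's error recovery setting reported by smartctl.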

If you want to understand the issue, search this list's archives for
various combinations of "scterc", "URE", "timeout mismatch".  You should
also see if your distro has a cron job that performs a "check" scrub on
your arrays for you.
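
If no such cron job turns up, a "check" scrub can also be requested by hand through the md sysfs interface. The following is only a sketch: the function name start_check is made up for illustration, and the array name md0 in the default path is an assumption, so substitute your own array. It needs root on a real system.

```shell
#!/bin/sh
# Ask md to start a "check" scrub on one array, then print the
# resulting sync_action state. Hypothetical helper.
start_check() {
    md_dir="${1:-/sys/block/md0/md}"   # assumed array name: md0
    if [ ! -w "$md_dir/sync_action" ]; then
        echo "cannot write $md_dir/sync_action" >&2
        return 1
    fi
    echo check > "$md_dir/sync_action" # request the scrub
    cat "$md_dir/sync_action"          # show what md now reports
}
```

Progress shows up in /proc/mdstat while the check runs, and once it finishes the mismatch_cnt file in the same md directory reports how many inconsistent sectors were found.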

HTH,

Phil