Re: How to avoid complete rebuild of RAID 6 array (6/8 active devices)
From: Matthias Urlichs <hidden>
Date: 2008-07-14 22:58:16
Hi, David Greaves:
> I've found that once a disk starts to go bad there is a very strong tendency for it to continue to deteriorate.
In my experience, that's true for older disks, but not necessarily for
those that are new and simply have a spot or two where the magnetizable
layer is a wee bit too thin.
However, even if they do in fact continue to deteriorate, the ability to
re-map the offending areas and continue gives me an order of magnitude
more time to deal with the problem.
In fact, as I said, there may be problems lurking on other disks which I
just haven't found yet (how often do you read all 5TB of your data?),
which means that a feature like this is the difference between being
able to recover and certain data loss, RAID-6 notwithstanding.
NB, one other problem I've observed (older kernel, I don't know if it's
been fixed) is that a resync is restarted from the beginning when a
fault on a second disk is encountered. BAD idea.
NB2, my ideal RAID recovery scenario looks like this:
* When a disk access fails, the offender is switched to write-only mode.
I.e., the kernel ignores it when reading, but still tries to write
correct data when something's updated.
* In order to re-sync a new disk, simply duplicate the old one if it
hasn't been removed yet; of course, you need to do "real" recovery for
the bad spots, and you need the aforementioned write-only code to
update both (when writing to the area that's already synced up).
The _huge_ advantage of this process would be that a re-sync does not
affect the array's read performance at all (other than the higher CPU
usage). For some people, that can be quite important.
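To make the scenario above concrete, here is a minimal toy sketch in Python. It is not md code, and every name in it (Disk, ToyArray, start_replace, resync_step) is made up for illustration. It also assumes a single XOR parity member rather than RAID-6's two, and that rewriting a bad block lets the drive re-map it.

```python
from dataclasses import dataclass, field
from functools import reduce
from operator import xor


@dataclass
class Disk:
    blocks: list                            # one int per block (toy sector data)
    bad: set = field(default_factory=set)   # blocks that fail when read
    write_only: bool = False                # skipped by normal reads once set

    def read(self, i):
        if i in self.bad:
            raise IOError(f"unreadable block {i}")
        return self.blocks[i]

    def write(self, i, value):
        self.blocks[i] = value
        self.bad.discard(i)  # assumption: a rewrite lets the drive re-map the spot


class ToyArray:
    """Hypothetical single-parity array: the last member is the XOR of the rest."""

    def __init__(self, disks):
        self.disks = disks
        self.target = None   # index of the member being replaced, if any
        self.spare = None    # the new disk
        self.synced = 0      # blocks [0, synced) of the spare are up to date

    def _reconstruct(self, m, i):
        # "Real" recovery: XOR the same block from every other member.
        peers = [d for k, d in enumerate(self.disks) if k != m]
        if any(d.write_only for d in peers):
            raise IOError("single parity cannot cover a second failed member")
        return reduce(xor, (d.read(i) for d in peers))

    def read(self, m, i):
        disk = self.disks[m]
        if not disk.write_only:
            try:
                return disk.read(i)
            except IOError:
                disk.write_only = True  # first failure: stop reading, keep writing
        return self._reconstruct(m, i)

    def _store(self, m, i, value):
        self.disks[m].write(i, value)   # write-only members still get updates
        if m == self.target and i < self.synced:
            self.spare.write(i, value)  # keep the already-copied region current

    def write(self, m, i, value):
        # m must be a data member, i.e. not the last (parity) disk.
        parity = len(self.disks) - 1
        others = [self.read(k, i) for k in range(parity) if k != m]
        self._store(m, i, value)
        self._store(parity, i, reduce(xor, others, value))

    def start_replace(self, m, spare):
        self.target, self.spare, self.synced = m, spare, 0

    def resync_step(self):
        i, old = self.synced, self.disks[self.target]
        try:
            value = old.read(i)         # plain copy: the other members stay idle
        except IOError:
            value = self._reconstruct(self.target, i)  # bad spot only
        self.spare.write(i, value)
        self.synced += 1


# Two data disks plus parity; block 2 of the first disk is unreadable.
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
parity = [a ^ b for a, b in zip(*data)]
array = ToyArray([Disk(list(data[0]), bad={2}), Disk(list(data[1])), Disk(parity)])
array.read(0, 2)                        # fails over to parity, marks disk 0 write-only
array.start_replace(0, Disk([0] * 4))
array.resync_step()
array.resync_step()
array.write(0, 1, 9)                    # lands on both the old disk and the spare
array.resync_step()
array.resync_step()
```

In this sketch the other members are only read for the one unreadable block, which is where the read-performance advantage would come from. A real implementation would also have to handle the second parity, concurrent I/O and persistent tracking of the synced region.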
Now where can I get the largish chunk of time required to implement all
of this ... oh well.
--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
The way to a man's heart is through the left ventricle.