Re: recovering RAID5 from multiple disk failures
From: Chris Murphy <hidden>
Date: 2013-02-03 00:39:33
On Feb 2, 2013, at 5:23 PM, Phil Turmel [off-list ref] wrote:
I do disagree. The above, combined with:

I do know where the bad sectors are from the ddrescue report. We are talking about less than 50kB bad data on disk1. Unfortunately, disk3 is worse, but there is no sector that is bad on both disks.

Leads me to recommend "mdadm --create --assume-clean" using the original drives, taking care to specify the devices in the proper order (per their "Raid Device" number in the --examine reports). I still haven't seen any data that definitively links specific serial numbers to specific raid device numbers. Please do that.

After re-creating the array, and setting all the drive timeouts to 7.0 seconds, issue a "check" scrub:

    echo "check" >/sys/block/md0/md/sync_action

This should clean up the few pending sectors on disk #1 by reconstruction from the others, and may very well do the same for disk #3.

If disk #3 gets kicked out at this point, assemble in degraded mode with disk #2, #4, and a fresh copy of disk #1 (picking up the new superblock and any fixes during the partial scrub). Then "--add" a spare (wiped) disk and let the array rebuild.

And grab your data.
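
The sequence described above can be sketched as a dry-run script. This is only an illustration under stated assumptions, not a tested recovery procedure: the device names are hypothetical, the members are assumed to be first partitions on four disks, and the helper names run and plan are made up for this sketch. It prints the commands rather than executing them, and it omits options such as --chunk and --metadata, which would have to match the original --examine output.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it.
# HYPOTHETICAL member devices, listed in "Raid Device" order 0..3.
# Substitute the real devices only after matching serial numbers
# to raid device numbers from the --examine reports.
DEVICES="/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1"
MD=md0

# Print the command; replace the echo with "$@" to actually run it.
run() { echo "+ $*"; }

plan() {
    # 1. Set SCT ERC read/write timeouts to 7.0 seconds on every drive
    #    (smartctl takes the value in units of 100 ms, hence 70).
    for part in $DEVICES; do
        disk=${part%[0-9]}    # strip the partition digit: /dev/sdb1 -> /dev/sdb
        run smartctl -l scterc,70,70 "$disk"
    done

    # 2. Re-create the array in place without triggering a resync.
    run mdadm --create "/dev/$MD" --assume-clean --level=5 \
        --raid-devices=4 $DEVICES

    # 3. Start the "check" scrub.
    run sh -c "echo check > /sys/block/$MD/md/sync_action"
}

plan
```

Keeping everything behind run makes it possible to review the exact device order before anything is written to the disks.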
OK I understand. This seems reasonable to me as well. It is very important to get *each* drive's SCT ERC set before starting the check!

So basically, disk1 being out of sync in this instance is considered a minor problem, and worth taking a chance on in order to avoid losing the 50kB of data affected by bad sectors, because those sectors may make all the difference in easily getting the array up, mounted, and the data off the disks.

Chris Murphy