Thread: 9 messages, 5 authors, 2011-04-09

Re: 4-disk raid5 with 2 disks going bad: best way to proceed?

From: Roberto Spadim <hidden>
Date: 2011-04-07 03:35:04

i'm not an expert with linux raid, but... stop the array, add 2 new
disks, copy (dd) each of the two bad disks onto a new one, and start the
array with the new disks (they must be the same size)
(i don't know if raid5 has spare disks or how they work, maybe
there's an 'online' solution without stopping the array)
maybe other guys here could help you better, but this one works :)
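
The stop / copy / reassemble steps above can be sketched as a dry run. This is only a minimal sketch, not a tested recovery procedure: the function name clone_plan and every device name are hypothetical, and the code only builds the command strings (mdadm --stop, dd with conv=noerror,sync, mdadm --assemble) for review without running anything.

```python
def clone_plan(array, members, clones, block_size="4096"):
    """Build the commands for the stop / dd / reassemble procedure.

    array   -- md device, e.g. "/dev/md0" (hypothetical name)
    members -- list of all current member devices of the array
    clones  -- dict mapping each failing member to its replacement disk
    Nothing is executed; the commands are returned as strings.
    """
    for old in clones:
        if old not in members:
            raise ValueError(f"{old} is not a member of {array}")

    # 1. Stop the array so the members are no longer in use.
    plan = [["mdadm", "--stop", array]]

    # 2. Copy each failing disk onto its replacement.
    #    conv=noerror,sync continues past read errors and pads the
    #    unreadable input block with zeros so offsets stay aligned.
    for old, new in clones.items():
        plan.append(["dd", f"if={old}", f"of={new}",
                     f"bs={block_size}", "conv=noerror,sync"])

    # 3. Reassemble with the clones in place of the failing disks.
    new_members = [clones.get(dev, dev) for dev in members]
    plan.append(["mdadm", "--assemble", array] + new_members)

    return [" ".join(cmd) for cmd in plan]


if __name__ == "__main__":
    for cmd in clone_plan(
        "/dev/md0",
        ["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1"],
        {"/dev/sdb1": "/dev/sde1"},
    ):
        print(cmd)
```

One thing to keep in mind with this approach: any block dd cannot read ends up as zeros on the clone, and md has no way to tell that data from good data afterwards. The small block size here is just meant to limit how much gets zeroed per read error.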

2011/4/6 rob pfile [off-list ref]:
Hi all,

any collective wisdom on what to do here? i've got a 4-disk raid5, and the most recent checkarray showed several bad blocks caused by uncorrectable read errors on two of the disks in the array. both disks in question show 0 reallocated sectors, but one looks like this:

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       16

and the other like this:

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       14
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       11

i'm a bit worried about failing one of these disks for fear that the other might give uncorrectable read errors during the rebuild. if i *had* to choose one, should i choose the one with all the pending sectors, or the one with all the uncorrectable sectors?
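
To put the two disks side by side, the raw values can be pulled out of the attribute rows quoted above. This is a minimal sketch that assumes the ten-column row layout shown in this message; the helper name parse_smart_rows and the disk_a / disk_b labels are made up for illustration, and the code does not say which disk is the safer one to fail.

```python
def parse_smart_rows(text):
    """Map attribute name -> raw value for SMART attribute table rows.

    Assumes rows laid out as in the message above:
    ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # not an attribute row
        try:
            values[fields[1]] = int(fields[9])
        except ValueError:
            continue  # raw value is not a plain integer; skip it
    return values


# The two tables exactly as quoted in this message.
disk_a = """\
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       16
"""

disk_b = """\
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       14
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       11
"""

if __name__ == "__main__":
    a, b = parse_smart_rows(disk_a), parse_smart_rows(disk_b)
    for name in a:
        print(f"{name:24} {a[name]:>4} {b.get(name, 0):>4}")
```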

does it make sense to do a smartctl -t offline scan on one or both of these disks first?

i guess i could take the array offline, clone one of the disks with dd, and then swap the clone in. but... is there a way to clone one disk in the array using mdadm? in other words, is there a way to construct a clean copy of one of the disks even if there are raid-correctable read errors?

i do have backups, so perhaps it will not kill me if the array dies, but i'd like to tread carefully and try and get out of this mess without nuking everything.

thanks for any advice,

rob

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial