Re: [Question] Fail event during reshape raid5 -> raid6

From: NeilBrown <hidden>
Date: 2013-09-04 06:13:40

On Tue, 03 Sep 2013 09:20:00 +0200 Roland 'ValiDOM' Jungnickel
[off-list ref] wrote:

hi!

I started reshaping a 3-device raid5 to a 4-device raid6. After some 
hours, I got a fail event on a harddrive which was already used in the 
raid5.

There are two different raid-groups on these harddrives; a reshape of 
the second one has not yet started.

What would you suggest?
* stop the reshape, go back to raid5 and start rebuild to the new disk 
(how?)
* just wait, hope and pray... (there is no backup; I do not have 
harddrives to back up to, as this is just too much...)

md0 : active raid6 sdc2[0] sde2[4](F) sdb2[3] sdd2[1]
       2858420224 blocks super 0.91 level 6, 64k chunk, algorithm 18 [4/2] [UU__]
       [=>...................]  reshape =  7.8% (112662528/1429210112) finish=4938.2min speed=4442K/sec

md1 : active raid5 sdc1[0] sde1[2] sdb1[3](S) sdd1[1]
       1048575744 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

(md1 not mounted, sde1 will fail once used/checked and the raid5 will 
automatically start rebuild on sdb1)

I would probably add the new good drive to the array (after some basic
testing to ensure that it really is good), then stop the array and restart
it.
I think it will recover and reshape at the same time, though that might
depend on which kernel you are using.
It should be fairly easy to confirm what will happen by testing with
loop-back devices.
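
A rough sketch of such a loop-back rehearsal follows. It is not from the
original poster's setup: the array name /dev/md100, the scratch directory
/tmp/mdtest, the 200M image size and the helper names (run, attach,
rehearse) are all assumptions made for illustration. By default it only
prints the commands; set RUN=1 and run it as root on a machine you do not
care about to actually execute them.

```shell
#!/bin/sh
# Sketch only: rehearse "raid5 -> raid6 reshape, device fails, add a spare,
# stop and re-assemble" on loop devices. All names below are assumptions.
MD=/dev/md100          # assumed to be unused
DIR=/tmp/mdtest        # assumed scratch space

run() {
    # Print the command; execute it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

attach() {
    # $1 = index. Creates a sparse image and sets DEV to its loop device
    # (a placeholder such as <loop2> in print-only mode).
    img="$DIR/d$1.img"
    run truncate -s 200M "$img"
    if [ "${RUN:-0}" = 1 ]; then
        DEV=$(losetup --find --show "$img")
    else
        echo "+ losetup --find --show $img"
        DEV="<loop$1>"
    fi
}

rehearse() {
    run mkdir -p "$DIR"
    attach 0; D0=$DEV
    attach 1; D1=$DEV
    attach 2; D2=$DEV      # plays the drive that fails
    attach 3; D3=$DEV      # fourth device for the raid6
    attach 4; D4=$DEV      # plays the new good drive

    # 3-device raid5 with 0.90 metadata, as in the report above.
    run mdadm --create "$MD" --run --metadata=0.90 --level=5 \
        --raid-devices=3 "$D0" "$D1" "$D2"
    run mdadm --wait "$MD"
    run mdadm "$MD" --add "$D3"

    # Throttle resync speed so the reshape is still running at the failure.
    run sh -c "echo 1000 > /sys/block/${MD##*/}/md/sync_speed_max"
    run mdadm --grow "$MD" --level=6 --raid-devices=4 \
        --backup-file="$DIR/backup"

    # Fail an original member mid-reshape, then add the replacement.
    run mdadm "$MD" --fail "$D2"
    run mdadm "$MD" --add "$D4"

    # Stop and restart the array, then see what the kernel decides to do.
    run mdadm --stop "$MD"
    run mdadm --assemble "$MD" --backup-file="$DIR/backup" \
        "$D0" "$D1" "$D3" "$D4"
    run cat /proc/mdstat
}

rehearse
```

Watching /proc/mdstat after the final assemble should show whether your
kernel recovers and reshapes together or one after the other. Clean up
afterwards with mdadm --stop and losetup -d on each loop device.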

mdadm-3.3 (just released) can reverse a reshape for you.
   mdadm --assemble /dev/md0 --update=revert-reshape /dev/sd[dbc]2 \
         --backup=/whatever
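
Since revert-reshape needs mdadm 3.3 or later, it may be worth checking the
installed version first. The snippet below is a small sketch of one way to
do that; version_ge is a hypothetical helper, it relies on GNU sort -V, and
it assumes the usual "mdadm - vX.Y - date" banner.

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 is >= $2.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# mdadm prints its banner (assumed form "mdadm - v3.3 - ...") on stderr.
have=$(mdadm --version 2>&1 | sed -n 's/^mdadm - v\([0-9][0-9.]*\).*/\1/p')

if [ -n "$have" ] && version_ge "$have" 3.3; then
    echo "mdadm $have: --update=revert-reshape should be available"
else
    echo "mdadm ${have:-not found}: need 3.3 or later for revert-reshape"
fi
```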

NeilBrown

