Re: RAID grow and disk failure
From: Neil Brown <hidden>
Date: 2010-06-28 23:49:13
On Sat, 26 Jun 2010 15:12:35 +0200 Piergiorgio Sartor [off-list ref] wrote:
> Hi,
>
> > Assuming the code doesn't have any bugs, the reshape will stop, then
> > immediately restart picking up where it left off.
>
> thanks, that's what I wanted to know.
>
> > You will of course end up with a degraded array
>
> Yes, that was clear.
>
> > It might be nice in these circumstances to abort the reshape and revert
> > to the previous number of devices - particularly if it was the new
> > device that failed. However that currently isn't supported.
>
> Well, probably as an option, it could be interesting.
>
> Actually, I would still be interested (we already discussed the topic)
> in a RAID-5/6 with HDDs of different sizes. This would simplify many
> things...
>
> > > 1) mdadm --grow ... mdadm --wait pvresize
> >
> > Yes.
> >
> > > 2) mdadm --grow pvresize
> >
> > No. Until the reshape has completed, the extra space is not available.
>
> There seems to be an issue here, maybe.
> Using the command line:
>
>   mdadm --grow /dev/md/vol02 --bitmap=none;
>   mdadm --grow /dev/md/vol02 -n 9 --backup-file=/var/tmp/md125.backup;
>   mdadm --wait /dev/md/vol02;
>   mdadm --grow /dev/md/vol02 --bitmap=internal --bitmap-chunk=128
>
> Note that /dev/md/vol02 is the usual link to /dev/md125, which should
> be equivalent for this purpose, I guess.
>
> I got (in two independent tests):
>
>   mdadm: Need to backup 2688K of critical section..
>   mdadm: failed to set internal bitmap.
>
> Re-issuing:
>
>   mdadm --wait /dev/md/vol02;
>   mdadm --grow /dev/md/vol02 --bitmap=internal --bitmap-chunk=128
>
> does wait.
>
> Could it be that the devices (being USB) are so slow that some race
> condition is uncovered, and the immediate "--wait" after the "--grow"
> does not work?

Yes, there is a race here. The reshape doesn't quite start instantly and
so --wait doesn't notice.
I've added a note to my todo-list to look into this.

For now, a 'sleep 1' between the --grow and the --wait should be enough.

Thanks,
NeilBrown
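
The workaround above might be scripted roughly as follows. This is only a sketch built from the commands quoted in this thread: the function name grow_and_wait and the MDADM and SETTLE variables are hypothetical helpers added for illustration, and the one-second pause is the value suggested above, not a guaranteed bound on slow devices.

```shell
#!/bin/sh
# Sketch: pause briefly between "mdadm --grow" and "mdadm --wait" so the
# reshape has time to start before --wait looks for it.
# MDADM and SETTLE are hypothetical knobs, not part of mdadm itself.
MDADM="${MDADM:-mdadm}"     # command to run; set to "echo" for a dry run
SETTLE="${SETTLE:-1}"       # seconds to sleep before waiting

grow_and_wait() {
    md="$1"        # array device, e.g. /dev/md/vol02
    ndev="$2"      # new number of devices, e.g. 9
    backup="$3"    # backup file, e.g. /var/tmp/md125.backup

    # Drop the bitmap first, as in the original command sequence.
    $MDADM --grow "$md" --bitmap=none || return 1
    # Start the reshape.
    $MDADM --grow "$md" -n "$ndev" --backup-file="$backup" || return 1
    # Give the reshape a moment to actually start.
    sleep "$SETTLE"
    # Block until the reshape has finished.
    $MDADM --wait "$md"
    # Re-add the internal bitmap once the array is idle again.
    $MDADM --grow "$md" --bitmap=internal --bitmap-chunk=128
}
```

With the values from this thread, the call would be: grow_and_wait /dev/md/vol02 9 /var/tmp/md125.backup. Running it with MDADM=echo prints the commands instead of executing them, which is a cheap way to check the sequence first.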