Re: Growing raid 5 to 6; /proc/mdstat reports a strange value?
From: Neil Brown <hidden>
Date: 2010-02-10 02:12:55
Subsystem: software raid (multiple disks) support, the rest
Maintainers: Song Liu, Yu Kuai, Linus Torvalds
On Fri, 29 Jan 2010 23:23:34 +1100 Neil Brown [off-list ref] wrote:

> On Sun, 24 Jan 2010 19:49:31 -0800 Michael Evans [off-list ref] wrote:
>
> > mdX : active raid5 sdd1[8](S) sdb1[7](S) sdf8[0] sdl8[4] sdk2[5] sdc1[6] sdj6[3] sdi8[1]
> >       Y blocks super 1.1 level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]
> >
> > # mdadm --grow /dev/mdX --level=6 --raid-devices=8 --backup-file=/root/mdX.backupfile
> >
> > mdX : active raid6 sdd1[8] sdb1[7] sdf8[0] sdl8[4] sdk2[5] sdc1[6] sdj6[3] sdi8[1]
> >       Y blocks super 1.1 level 6, 128k chunk, algorithm 18 [8/9] [UUUUUU_U]
> >       [>....................]  reshape =  0.0% (33920/484971520) finish=952.6min speed=8480K/sec
> >
> > !!! mdadm 3.1.1 I wanted an 8 device raid-6; Why do you show 9?
>
> That is weird isn't it. It is showing that 8 devices are in the array, of
> which 9 are working. That cannot be right.
> More worrying is that the second last device claim to not be present, which
> doesn't seem right.

The second last device being missing is actually correct. The '_' doesn't
actually mean "missing" but just "not completely in-sync".
When you converted from RAID5 to RAID6 it added the 7th device which clearly
was not in-sync.
Then converting to an 8-device array added the 8th device, but as the array
was not expecting any data to be on this device it is by definition
in-sync and so represented by "U".
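
As a rough illustration of how to read that bracketed field, here is a minimal sketch in C. It is not the kernel's code: `render_sync_map` is a made-up helper name, and the per-slot boolean array is an assumed stand-in for the In_sync state of each device slot.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the md driver.
 * Writes one character per slot into buf: 'U' for a slot that is fully
 * in-sync, '_' for a slot that is absent or not yet completely in-sync.
 * buf must have room for nslots + 1 bytes.
 * Returns the number of in-sync slots.
 */
static int render_sync_map(const bool *in_sync, size_t nslots, char *buf)
{
	int working = 0;

	for (size_t i = 0; i < nslots; i++) {
		buf[i] = in_sync[i] ? 'U' : '_';
		if (in_sync[i])
			working++;
	}
	buf[nslots] = '\0';
	return working;
}
```

With six in-sync slots, then one slot still being rebuilt, then one in-sync slot, this produces "UUUUUU_U" and a count of 7, which is the count the report should have shown.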
The only problem is that it says "9" are in-sync where it should say "7"
are.
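
To make the arithmetic concrete, here is a simplified model in C of the bookkeeping before and after the fix. It is only a sketch under stated assumptions: the struct and function names are invented for illustration, it assumes the array had 7 slots with one not in-sync (degraded = 1) once the level change was applied, and it assumes the second number in "[8/9]" is the slot count minus 'degraded'.

```c
/* Illustrative model only; these names do not exist in the kernel. */
struct reshape_model {
	int previous_raid_disks;	/* slots before the reshape */
	int raid_disks;			/* slots after the reshape */
	int degraded;			/* value when the reshape starts */
};

/* Old behaviour: every added spare is counted and the existing
 * degraded value is overwritten. */
static int degraded_old(const struct reshape_model *m,
			const int *added_slots, int n_added)
{
	(void)added_slots;	/* slot numbers were not considered */
	return (m->raid_disks - m->previous_raid_disks) - n_added;
}

/* Patched behaviour: start from the existing degraded value and count
 * only spares placed in brand-new slots, since only those are in-sync. */
static int degraded_new(const struct reshape_model *m,
			const int *added_slots, int n_added)
{
	int in_sync_added = 0;

	for (int i = 0; i < n_added; i++)
		if (added_slots[i] >= m->previous_raid_disks)
			in_sync_added++;

	return m->degraded
		+ (m->raid_disks - m->previous_raid_disks)
		- in_sync_added;
}

/* Assumed derivation of the second number in the "[n/m]" field. */
static int working_count(int raid_disks, int degraded)
{
	return raid_disks - degraded;
}
```

For this report the model is { previous_raid_disks = 7, raid_disks = 8, degraded = 1 } with spares added to slots 6 and 7. The old calculation gives (8 - 7) - 2 = -1, hence 8 - (-1) = 9. The patched calculation gives 1 + (8 - 7) - 1 = 1, hence 8 - 1 = 7.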
The following patch, which I have submitted upstream, fixes this.
Thanks again for the report.
NeilBrown
Author: NeilBrown [off-list ref]
Date: Tue Feb 9 12:31:47 2010 +1100
md: fix 'degraded' calculation when starting a reshape.
This code was written long ago when it was not possible to
reshape a degraded array. Now it is, so the current level of
degraded-ness needs to be taken into account. Also, newly added
devices should only reduce degradedness if they are deemed to be
in-sync.
In particular, if you convert a RAID5 to a RAID6, and increase the
number of devices at the same time, then the 5->6 conversion will
make the array degraded so the current code will produce a wrong
value for 'degraded' - "-1" to be precise.
If the reshape runs to completion end_reshape will calculate a correct
new value for 'degraded', but if a device fails during the reshape an
incorrect decision might be made based on the incorrect value of
"degraded".
This patch is suitable for 2.6.32-stable and, if they are still open,
2.6.31-stable and 2.6.30-stable as well.
Cc: stable@kernel.org
Reported-by: Michael Evans [off-list ref]
Signed-off-by: NeilBrown [off-list ref]
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e84204e..b5629c3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5464,11 +5464,11 @@ static int raid5_start_reshape(mddev_t *mddev)
 		    !test_bit(Faulty, &rdev->flags)) {
 			if (raid5_add_disk(mddev, rdev) == 0) {
 				char nm[20];
-				if (rdev->raid_disk >= conf->previous_raid_disks)
+				if (rdev->raid_disk >= conf->previous_raid_disks) {
 					set_bit(In_sync, &rdev->flags);
-				else
+					added_devices++;
+				} else
 					rdev->recovery_offset = 0;
-				added_devices++;
 				sprintf(nm, "rd%d", rdev->raid_disk);
 				if (sysfs_create_link(&mddev->kobj,
 						      &rdev->kobj, nm))
@@ -5480,9 +5480,12 @@ static int raid5_start_reshape(mddev_t *mddev)
 				break;
 		}
 
+	/* When a reshape changes the number of devices, ->degraded
+	 * is measured against the large of the pre and post number of
+	 * devices.*/
 	if (mddev->delta_disks > 0) {
 		spin_lock_irqsave(&conf->device_lock, flags);
-		mddev->degraded = (conf->raid_disks - conf->previous_raid_disks)
+		mddev->degraded += (conf->raid_disks - conf->previous_raid_disks)
 			- added_devices;
 		spin_unlock_irqrestore(&conf->device_lock, flags);
 	}