Re: [md PATCH 5/6] md: allow number of drives in raid5 to be reduced
From: NeilBrown <hidden>
Date: 2009-03-27 19:39:43
On Sat, March 28, 2009 3:19 am, Andre Noll wrote:
On 19:53, NeilBrown wrote:quoted
Never allow the size to be reduced below the minimum (4 doe raid6, 3 otherwise).doe?
Left hand was one column too far left. d->f e->r. Fixed.
quoted
@@ -3723,6 +3723,7 @@ static sector_t reshape_request(mddev_t *mddev,sector_t sector_nr, int *skipped int i; int dd_idx; sector_t writepos, safepos, gap; + sector_t stripe_addr; if (sector_nr == 0) { /* If restarting in the middle, skip the initial sectors */@@ -3779,10 +3780,21 @@ static sector_t reshape_request(mddev_t *mddev,sector_t sector_nr, int *skipped wake_up(&conf->wait_for_overlap); } + if (mddev->delta_disks < 0) { + BUG_ON(conf->reshape_progress == 0); + stripe_addr = writepos; + BUG_ON((mddev->dev_sectors & + ~((sector_t)mddev->chunk_size / 512 - 1)) + - (conf->chunk_size / 512) - stripe_addr + != sector_nr); + } else { + BUG_ON(writepos != sector_nr + conf->chunk_size / 512); + stripe_addr = writepos; + }What's the point of the new stripe_addr variable? Isn't it equal to writepos in any case?
Yes, but it shouldn't be. In the second branch of the if, it should be
stripe_addr = sector_nr;
as I discovered yesterday while testing.
quoted
@@ -4738,14 +4755,25 @@ static int raid5_check_reshape(mddev_t *mddev) raid5_conf_t *conf = mddev_to_conf(mddev); int err; - if (mddev->delta_disks < 0 || - mddev->new_level != mddev->level) - return -EINVAL; /* Cannot shrink array or change level yet */ if (mddev->delta_disks == 0) return 0; /* nothing to do */ if (mddev->bitmap) /* Cannot grow a bitmap yet */ return -EBUSY; + if (mddev->degraded > conf->max_degraded) + return -EINVAL; + if (mddev->delta_disks < 0) { + /* We might be able to shrink, but the devices must + * be made bigger first. + * For raid6, 4 is the minimum size. + * Otherwise 2 is the minimum + */ + int min = 2; + if (mddev->level == 6) + min = 4; + if (mddev->raid_disks + mddev->delta_disks < min) + return -EINVAL; + }Hm, this code doesn't seem to do what the comment suggests. IMHO, we must check that (a) (raid_disks + delta_disks) * sizeof(smallest) is big enough and (b) raid_disks + delta_disks >= minimal admissible number of disks The comment says the devices must be made bigger (to satisfy (a)) but the code only checks (b).
Yes, the comment could probably be improved. The check for 'a' happens later when start_reshape is called.
quoted
@@ -4862,21 +4905,38 @@ static void raid5_finish_reshape(mddev_t *mddev) if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) { conf->previous_raid_disks = conf->raid_disks; - md_set_array_sectors(mddev, raid5_size(mddev, 0, 0)); - set_capacity(mddev->gendisk, mddev->array_sectors); - mddev->changed = 1; - - bdev = bdget_disk(mddev->gendisk, 0); - if (bdev) { - mutex_lock(&bdev->bd_inode->i_mutex); - i_size_write(bdev->bd_inode, - (loff_t)mddev->array_sectors << 9); - mutex_unlock(&bdev->bd_inode->i_mutex); - bdput(bdev); - } spin_lock_irq(&conf->device_lock); conf->reshape_progress = MaxSector; spin_unlock_irq(&conf->device_lock); + if (mddev->delta_disks > 0) { + conf->previous_raid_disks = conf->raid_disks;previous_raid_disks was already assigned earlier, so this can go away, imo.
Thanks. It is actually the first one that needs to go. The 'else' branch of that 'if' needs to have access to the original value of previous_raid_disks. Thanks a lot for the review. NeilBrown