Thread (9 messages) 9 messages, 3 authors, 2013-02-14

Re: Set disk faulty / hot disk remove ioctl bug for read-only MD?

From: NeilBrown <hidden>
Date: 2013-02-13 02:38:20
Subsystem: software raid (multiple disks) support, the rest · Maintainers: Song Liu, Yu Kuai, Linus Torvalds

On Tue, 12 Feb 2013 16:05:18 -0500 (EST) Joe Lawrence
[off-list ref] wrote:
Hi Neil,

I believe I found a bug when failing and removing component devices from 
read-only and auto-read-only MD RAID devices.
Thanks for reporting them!
Prior to commit 1ca69c4b "md: avoid taking the mutex on some ioctls", when
failing a read-only MD device component via mdadm --fail, EROFS was
returned.  Failing (and removing) auto-read-only components was permitted.
I wonder what we really want here....
Given that a read error on a read-only array will mark the device as faulty,
it seems fair to allow "--fail" to also mark a device as faulty on a
read-only array.
Do you have a particular preference for having "mdadm --fail" fail with EROFS
on a readonly array?

After the change, MD allows the failure of component devices for both
read-only and auto-read-only RAID devices.  Running mdadm --remove will
return EROFS when attempted on a read-only device.
I suspect this is preferred behaviour.
It gets a little interesting when trying to remove a component from an
auto-read-only MD.  The *first* attempt will fail with EBUSY as the rdev
was left in the Blocked state by the SET_DISK_FAULTY ioctl.  However,
because the HOT_REMOVE_DISK ioctl made it through to the "switch to rw mode
if started auto-readonly" code, it has been transitioned to read-write. 
Subsequent trips through the device's mddev->thread may clear the Blocked
flag via md_update_sb (should the reconfig_mutex be available).  A second or
third attempt at removing the failed component from the auto-read-only MD
would then succeed.
This definitely a bit odd.
In general "--remove" can fail with EBUSY for a short while after the
failure, but this happens after an arbitrary delay.

I think we just want to allow the "set to read-write" process to complete
cleanly before trying the actual removal.  Patch below seems to work for me.
Can you confirm please?

Thanks,
NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8b557d2..292cc2f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6529,7 +6529,17 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 			mddev->ro = 0;
 			sysfs_notify_dirent_safe(mddev->sysfs_state);
 			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-			md_wakeup_thread(mddev->thread);
+			/* mddev_unlock will wake thread */
+			/* If a device failed while we were read-only, we
+			 * need to make sure the metadata is updated now.
+			 */
+			if (test_bit(MD_CHANGE_DEVS, &mddev->flags)) {
+				mddev_unlock(mddev);
+				wait_event(mddev->sb_wait,
+					   !test_bit(MD_CHANGE_DEVS, &mddev->flags) &&
+					   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
+				mddev_lock(mddev);
+			}
 		} else {
 			err = -EROFS;
 			goto abort_unlock;

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help