Thread (5 messages) 5 messages, 3 authors, 2013-07-17

Re: MD RAID 1 fail/remove/add corruption in 3.10

From: NeilBrown <hidden>
Date: 2013-07-17 04:52:30

On Tue, 16 Jul 2013 14:49:20 -0400 Joe Lawrence [off-list ref]
wrote:
Hi Neil, Martin,

While testing patches to fix RAID1 repair GPF crash w/3.10-rc7
( http://thread.gmane.org/gmane.linux.raid/43351 ), I encountered disk
corruption when repeatedly failing, removing, and adding MD RAID1
component disks to their array.  The RAID1 was created with an internal
write bitmap and the test was run against alternating disks in the
set.  I bisected this behavior back to commit 7ceb17e8 "md: Allow
devices to be re-added to a read-only array", specifically these lines
of code:

remove_and_add_spares:

+		if (rdev->saved_raid_disk >= 0 && mddev->in_sync) {
+			spin_lock_irq(&mddev->write_lock);
+			if (mddev->in_sync)
+				/* OK, this device, which is in_sync,
+				 * will definitely be noticed before
+				 * the next write, so recovery isn't
+				 * needed.
+				 */
+				rdev->recovery_offset = mddev->recovery_cp;
+			spin_unlock_irq(&mddev->write_lock);
+		}
+		if (mddev->ro && rdev->recovery_offset != MaxSector)
+			/* not safe to add this disk now */
+			continue;

when I #ifdef 0 these lines out, leaving rdev->recovery_offset = 0,
then my tests run without incident.

If there is any instrumentation I can apply to remove_and_add_spares
I'll be happy to gather more data.  I'll send an attached copy of
my test programs in a reply so this mail doesn't get bounced by
any spam filters.
Thanks for the report Joe.

That code has problems.

If the array has a bitmap, then 'saved_raid_disk >= 0' means that the device
is fairly close, but the bitmap based resync is required first.  This code
skips that.

If the array does not have a bitmap, then 'saved_raid_disk >= 0' means that
this device was exactly right for this slot before, but there is no locking
to prevent updates going to the array between then super_1_validate checked
the event count, and when remove_and_add_spares tried to add it.

I suspect I should  just rip that code out and go back to the drawing board.

NeilBrown

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help