Thread: 9 messages, 2 authors, 2011-12-12

Re: raid6 rebuild not starting

From: NeilBrown <hidden>
Date: 2011-12-12 06:24:53
Subsystem: software raid (multiple disks) support, the rest · Maintainers: Song Liu, Yu Kuai, Linus Torvalds

On Mon, 12 Dec 2011 08:02:33 +0200 Anssi Hannula [off-list ref] wrote:
On Mon, Dec 12, 2011 at 7:42 AM, NeilBrown [off-list ref] wrote:
quoted
On Mon, 12 Dec 2011 07:22:17 +0200 Anssi Hannula [off-list ref] wrote:
quoted
On Mon, Dec 12, 2011 at 5:01 AM, NeilBrown [off-list ref] wrote:
quoted
On Sun, 11 Dec 2011 09:03:14 +0200 Anssi Hannula [off-list ref] wrote:
quoted
Hi!

After I rebooted during a raid6 rebuild, the rebuild didn't start again.
Instead, there is a flood of "RAID conf printout"s that seemingly happen
on array activity.

All the devices show up properly in --detail and two devices are marked
as "spare rebuilding", and I can access the contents of the array just
fine, but the rebuild doesn't actually start. Is this a bug or am I
missing something? :)

I was initially on 2.6.38.8, but also tried 3.1.4 which seems to have
the same issue. mdadm is 3.1.5.

I'm not using start_ro and writing to the array doesn't trigger a
rebuild either.

Attached are --examine outputs before assembly, kernel log output on
assembly, /proc/mdstat and --detail after assembly (on 3.1.4).
Thank you for the very detailed problem report.
Thanks for the quick response :)
quoted
Unfortunately it is a complete mystery to me what is happening.

The repeated "RAID conf printout" messages are almost certainly coming from
the end of raid5_remove_disk.
It is being called from remove_and_add_spares for each of the two devices
that are being rebuilt.  raid5_remove_disk declines to remove them because it
can keep rebuilding them.

remove_and_add_spares then counts them and notes there are 2.
md_check_recovery notes that this is > 0, so it should create a thread to run
md_do_sync.

md_do_sync should then print out a message like
 md: recovery of RAID array md0

but it doesn't.  So something went wrong.
There are three reasons that md_do_sync might not print a message:

1/ MD_RECOVERY_DONE is set.  As only md_do_sync ever sets it, that is
   unlikely, and in any case md_check_recovery clears it.
2/ mddev->ro != 0.  It is only ever set to 0, 1, or 2.  If it is 1 or 2
   then we would be able to see that in /proc/mdstat as a "(readonly)"
   status.  But we don't.
3/ MD_RECOVERY_INTR is set.  Again, md_check_recovery clears this.  It does
   get set if kthread_should_stop() returns 'true', but that should only
   happen if kthread_stop() was called.  That is only called by
   md_unregister_thread and I cannot see any way that could be called.
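
The three conditions listed above can be modelled in userspace to make it
easier to read the values the suggested printk would report.  This is only a
sketch, not the kernel code: the sk_ names, the struct and the bit positions
are all hypothetical stand-ins for mddev->recovery, mddev->ro and the
MD_RECOVERY_* bits in drivers/md/md.h, and the order of the checks is an
assumption.

```c
#include <stdbool.h>

/* Hypothetical bit positions, for this sketch only.  The real
 * MD_RECOVERY_* values are defined in drivers/md/md.h. */
enum { SK_RECOVERY_INTR = 3, SK_RECOVERY_DONE = 4 };

/* Which of the three conditions (if any) would keep the
 * "md: recovery of RAID array" message from being printed. */
enum sk_blocker {
	SK_BLOCK_NONE,	/* nothing in the way, message expected */
	SK_BLOCK_DONE,	/* reason 1: MD_RECOVERY_DONE is set */
	SK_BLOCK_RO,	/* reason 2: mddev->ro != 0 */
	SK_BLOCK_INTR	/* reason 3: MD_RECOVERY_INTR is set */
};

/* Minimal stand-in for the two mddev fields under discussion. */
struct sk_mddev {
	unsigned long recovery;	/* models mddev->recovery */
	int ro;			/* models mddev->ro: 0, 1 or 2 */
};

/* Userspace replacement for the kernel's test_bit(). */
static bool sk_test_bit(int nr, unsigned long word)
{
	return (word >> nr) & 1UL;
}

/* Apply the three checks in the order they are listed above
 * (assumed order) and report the first one that matches. */
enum sk_blocker sk_sync_blocker(const struct sk_mddev *m)
{
	if (sk_test_bit(SK_RECOVERY_DONE, m->recovery))
		return SK_BLOCK_DONE;
	if (m->ro != 0)
		return SK_BLOCK_RO;
	if (sk_test_bit(SK_RECOVERY_INTR, m->recovery))
		return SK_BLOCK_INTR;
	return SK_BLOCK_NONE;
}
```

Feeding the printed recovery and ro values into sk_sync_blocker() says which
of the three reasons applies.  If the printk at the top of md_do_sync never
fires at all, none of them is the cause and the thread is simply not being
started.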

So.  No idea.

Are you compiling these kernels yourself?
Nope (used Mageia kernels), but I did now (3.1.5).
quoted
If so, could you:
 - put a printk in the top of md_do_sync to report the values of
  mddev->recovery and mddev->ro
 - print a message whenever md_unregister_thread is called
 - in md_check_recovery, in the
               if (mddev->ro) {
                       /* Only thing we do on a ro array is remove
                        * failed devices.
                        */
                       mdk_rdev_t *rdev;

 'if' statement, print the value of mddev->ro.

Then see which of those printk's fire, and what they tell us.
Only the last one does, and mddev->ro == 0.

For reference, attached are the patch I used and the resulting log output.
Thanks.

So it isn't running md_do_sync at all. Odd.

Could you please add:
 - call "WARN_ON(1);" in print_raid5_conf() so we get a stack trace and can
   see who is calling it.
 - print the value that remove_and_add_spares is going to return.
Attached. As you can see, remove_and_add_spares returns 0.

--
Anssi Hannula

Please add:
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5c95ccb..fa56ac5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7328,8 +7328,10 @@ static int remove_and_add_spares(mddev_t *mddev)
 			}
 		}
 
+	printk("degraded=%d\n", mddev->degraded);
 	if (mddev->degraded) {
 		list_for_each_entry(rdev, &mddev->disks, same_set) {
+			printk("raid_disk=%d flags=%x\n", rdev->raid_disk, rdev->flags);
 			if (rdev->raid_disk >= 0 &&
 			    !test_bit(In_sync, &rdev->flags) &&
 			    !test_bit(Faulty, &rdev->flags))

'degraded' must be 2 as dmesg contains

[   45.544806] md/raid:md0: raid level 6 active with 8 out of 10 devices, algorithm 2

and 'degraded' is exactly the difference between '8' and '10' there.

raid disks 3 and 7 must have In_sync and Faulty clear as both of them just
show "spare rebuilding" in the 'detail' output.

So remove_and_add_spares "must" return 2.

Hopefully the above patch will help me understand which of those is wrong.
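
The expectation spelled out above can be checked with a small userspace
model of the counting condition.  It is only a sketch under stated
assumptions: the sk_ names and bit positions are hypothetical, an array
stands in for the mddev->disks list, and "increment a counter and return it"
is assumed from the earlier description that remove_and_add_spares counts
the devices still being rebuilt.  Only the three tests visible in the diff
context are modelled.

```c
#include <stddef.h>

/* Hypothetical bit positions, for this sketch only.  The real Faulty
 * and In_sync bit numbers are defined in drivers/md/md.h. */
enum { SK_FAULTY = 1, SK_IN_SYNC = 2 };

/* Minimal stand-in for the two rdev fields printed by the patch. */
struct sk_rdev {
	int raid_disk;		/* slot in the array, negative if none */
	unsigned long flags;	/* models rdev->flags */
};

/* Count the devices that hold a slot but are neither In_sync nor
 * Faulty, i.e. the ones shown as "spare rebuilding".  Nothing is
 * counted when the array is not degraded. */
int sk_count_recovering(int degraded, const struct sk_rdev *devs, size_t n)
{
	int spares = 0;
	size_t i;

	if (!degraded)
		return 0;
	for (i = 0; i < n; i++) {
		if (devs[i].raid_disk >= 0 &&
		    !(devs[i].flags & (1UL << SK_IN_SYNC)) &&
		    !(devs[i].flags & (1UL << SK_FAULTY)))
			spares++;
	}
	return spares;
}
```

With ten devices in slots 0 to 9, In_sync set on all but slots 3 and 7, and
degraded = 10 - 8 = 2, the model returns 2.  A return of 0 therefore needs
either degraded to be 0 or one of the two bits to be set on both rebuilding
devices, which is what the two printk lines are meant to distinguish.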

NeilBrown
