
Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition

From: Guoqing Jiang <hidden>
Date: 2021-01-26 14:10:20
Also in: lkml


On 1/26/21 13:58, Donald Buczek wrote:
>> Hmm, how about wake the waiter up in the while loop of raid5d?
>>
>> @@ -6520,6 +6532,11 @@ static void raid5d(struct md_thread *thread)
>>                          md_check_recovery(mddev);
>>                          spin_lock_irq(&conf->device_lock);
>>                  }
>> +
>> +               if ((atomic_read(&conf->active_stripes)
>> +                    < (conf->max_nr_stripes * 3 / 4) ||
>> +                    (test_bit(MD_RECOVERY_INTR, &mddev->recovery))))
>> +                       wake_up(&conf->wait_for_stripe);
>>          }
>>          pr_debug("%d stripes handled\n", handled);
>
> Hmm... With this patch on top of your other one, we still have the basic
> symptoms (md3_raid6 busy looping), but the sync thread is now hanging at
>
>     root@sloth:~# cat /proc/$(pgrep md3_resync)/stack
>     [<0>] md_do_sync.cold+0x8ec/0x97c
>     [<0>] md_thread+0xab/0x160
>     [<0>] kthread+0x11b/0x140
>     [<0>] ret_from_fork+0x22/0x30
>
> instead, which is
> https://elixir.bootlin.com/linux/latest/source/drivers/md/md.c#L8963

Not sure why recovery_active is not zero, because it is set to 0 before
blk_start_plug, raid5_sync_request returns 0, and skipped is also set
to 1. Perhaps handle_stripe calls md_done_sync.

Could you double-check the value of recovery_active? Or just don't wait
if the resync thread is interrupted:

wait_event(mddev->recovery_wait,
	   test_bit(MD_RECOVERY_INTR, &mddev->recovery) ||
	   !atomic_read(&mddev->recovery_active));
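
To make the difference concrete, here is a minimal userspace sketch of the two wait conditions. It is not kernel code: the struct and function names are hypothetical, the two fields only stand in for the MD_RECOVERY_INTR bit and mddev->recovery_active, and it assumes the existing wait at that line checks only recovery_active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not a kernel structure. */
struct sync_model {
	bool interrupted;     /* stands in for test_bit(MD_RECOVERY_INTR, &mddev->recovery) */
	int  recovery_active; /* stands in for atomic_read(&mddev->recovery_active) */
};

/* Assumed current condition: wait until recovery_active drops to zero. */
static bool current_wait_done(const struct sync_model *m)
{
	return m->recovery_active == 0;
}

/* Suggested condition: also stop waiting once the sync was interrupted. */
static bool proposed_wait_done(const struct sync_model *m)
{
	return m->interrupted || m->recovery_active == 0;
}

/* Usage example: interrupted sync with recovery_active stuck above zero. */
void sync_model_example(void)
{
	struct sync_model m = { .interrupted = true, .recovery_active = 3 };

	assert(!current_wait_done(&m)); /* would keep waiting */
	assert(proposed_wait_done(&m)); /* would return */
}
```

In this model the only case where the two conditions disagree is an interrupted sync with a nonzero recovery_active, which is the state the stack trace suggests. It does not explain why the counter stays nonzero.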

> And, unlike before, "md: md3: data-check interrupted." from the pr_info
> two lines above appears in dmesg.

Yes, that is intentional, since MD_RECOVERY_INTR is set by writing "idle" to sync_action.

Anyway, I will try the script and investigate the issue further.

Thanks,
Guoqing