Thread: 33 messages, 2 authors, 2020-07-08

Re: Assemblin journaled array fails

From: Song Liu <song@kernel.org>
Date: 2020-06-23 23:13:00

On Tue, Jun 23, 2020 at 6:17 AM Michal Soltys [off-list ref] wrote:
On 6/22/20 6:37 PM, Song Liu wrote:
Thanks for the trace. Looks like we may have some issues with
MD_SB_CHANGE_PENDING.
Could you please try the attached patch?
Should I run this along with the pr_debugs from the previous patch enabled?
We don't need those pr_debug() here.

Thanks,
Song
So with this patch applied, there is no extra output whatsoever once it gets past this point:

[  +0.371752] r5c_recovery_rewrite_data_only_stripes rewritten 20001 stripes to the journal, current ctx->pos 408461384 ctx->seq 866603361
[  +0.395000] r5c_recovery_rewrite_data_only_stripes rewritten 21001 stripes to the journal, current ctx->pos 408479568 ctx->seq 866604361
[  +0.371255] r5c_recovery_rewrite_data_only_stripes rewritten 22001 stripes to the journal, current ctx->pos 408496600 ctx->seq 866605361
[  +0.401013] r5c_recovery_rewrite_data_only_stripes rewritten 23001 stripes to the journal, current ctx->pos 408515472 ctx->seq 866606361
[  +0.370543] r5c_recovery_rewrite_data_only_stripes rewritten 24001 stripes to the journal, current ctx->pos 408532112 ctx->seq 866607361
[  +0.319253] r5c_recovery_rewrite_data_only_stripes done
[  +0.061560] r5c_recovery_flush_data_only_stripes enter
[  +0.075697] r5c_recovery_flush_data_only_stripes before wait_event

That is, apart from the 'task <....> blocked for' traces, and unless the pr_debug()s are enabled.

There were a few 'md_write_start set MD_SB_CHANGE_PENDING' *before* that (all of them likely related to another raid that is active at the moment, as these were happening during that lengthy r5c_recovery_flush_log() process).
Hmm.. this is weird, as I think I marked every instance of set_bit
MD_SB_CHANGE_PENDING.
Would you mind confirming that those are from the other array, with something like:
diff --git i/drivers/md/md.c w/drivers/md/md.c
index dbbc8a50e2ed2..e91acfdcec032 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -8480,7 +8480,7 @@ bool md_write_start(struct mddev *mddev, struct bio *bi)
                        mddev->in_sync = 0;
                        set_bit(MD_SB_CHANGE_CLEAN, &mddev->sb_flags);
                        set_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags);
-                       pr_info("%s set MD_SB_CHANGE_PENDING\n", __func__);
+                       pr_info("%s: md: %s set MD_SB_CHANGE_PENDING\n", __func__, mdname(mddev));
                        md_wakeup_thread(mddev->thread);
                        did_change = 1;
                }
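
Once each message carries the array name, the kernel log can be tallied per array to see which array the messages come from. This is a minimal sketch, not part of the patch: it assumes the log lines follow the "<func>: md: <array> set MD_SB_CHANGE_PENDING" format from the pr_info() above, and the helper name count_by_array is hypothetical.

```python
import re
from collections import Counter

# Assumed message format, taken from the pr_info() in the diff above:
#   "<func>: md: <array> set MD_SB_CHANGE_PENDING"
PATTERN = re.compile(r"(\w+): md: (\S+) set MD_SB_CHANGE_PENDING")


def count_by_array(lines):
    """Return a Counter mapping array name -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # group(2) is the array name printed via mdname(mddev)
            counts[match.group(2)] += 1
    return counts
```

For example, feeding it the saved dmesg output with count_by_array(open(path)) gives one count per array; if the array being assembled never shows up, the messages all belong to the other array.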

Thanks,
Song