Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition
From: Yu Kuai <hidden>
Date: 2023-09-05 16:19:41
Also in:
lkml
Hi,

On 2023/08/30 9:36, Yu Kuai wrote:
> Hi,
>
> On 2023/08/29 4:32, Dragan Stancevic wrote:
>> Just a followup on 6.1 testing. I tried reproducing this problem for 5 days with the 6.1.42 kernel without your patches, and I was not able to reproduce it.

Oops, I forgot that you need to backport this patch first to reproduce this problem:

https://lore.kernel.org/all/20230529132037.2124527-2-yukuai1@huaweicloud.com/

The patch fixes the deadlock as well, but it introduces some regressions.

Thanks,
Kuai

>> It seems that 6.1 has some other code that prevents this from happening.
>
> I see that there are lots of patches for raid456 between 5.10 and 6.1. However, I remember that I was able to reproduce the deadlock on kernels after 6.1, and it's true that it's not easy to reproduce, see below:
>
> https://lore.kernel.org/linux-raid/e9067438-d713-f5f3-0d3d-9e6b0e9efa0e@huaweicloud.com/
>
> My guess is that the problem is harder to reproduce on 6.1 than on 5.10 due to some changes inside raid456.
>
> By the way, raid10 had a similar deadlock, which can be fixed the same way, so it makes sense to backport these patches.
>
> https://lore.kernel.org/r/20230529132037.2124527-5-yukuai1@huaweicloud.com
>
> Thanks,
> Kuai
>
>> On 5.10 I can reproduce it within minutes to an hour.