Re: raid5 Journal Recovery Bug
From: Song Liu <song@kernel.org>
Date: 2022-08-22 21:12:44
On Mon, Aug 22, 2022 at 1:40 PM Logan Gunthorpe [off-list ref] wrote:
> On 2022-08-22 13:12, Song Liu wrote:
> > On Mon, Aug 22, 2022 at 9:28 AM Logan Gunthorpe [off-list ref] wrote:
> > > On 2022-08-22 01:04, Song Liu wrote:
> > > > Could you please add some printk so that we know which condition
> > > > triggered handle_stripe_fill() here:
> > > >
> > > >         if (s.to_read || s.non_overwrite
> > > >             || (s.to_write && s.failed)
> > > >             || (s.syncing && (s.uptodate + s.compute < disks))
> > > >             || s.replacing
> > > >             || s.expanding)
> > > >                 handle_stripe_fill(sh, &s, disks);
> > > >
> > > > This would help us narrow down to the exact condition. I guess it is
> > > > "(s.to_write && s.failed)", but I am not quite sure.
> > >
> > > Ok, I hit this bug on a stripe and got these values for the call:
> > >
> > >         to_read = 0
> > >         non_overwrite = 0
> > >         to_write = 0
> > >         failed = 1
> > >         syncing = 1
> > >         uptodate = 2
> > >         compute = 0
> > >         disks = 3
> > >         replacing = 0
> > >         expanding = 0
> > >
> > > So it's actually the "(s.syncing && (s.uptodate + s.compute < disks))"
> > > condition that is getting hit.
> >
> > Thanks for the information! So the stripe is syncing. Could you please try
> > whether the following fixes the issue?
>
> Yup, thanks! Looks like that fixes my test case. I can do more general
> testing on it later this week.

Awesome! Could you please run more tests and submit the patch?

Thanks,
Song