Re: [PATCH] md/raid6: use valid sector values to determine if an I/O should... | linux-raid

Re: [PATCH] md/raid6: use valid sector values to determine if an I/O should wait on the reshape

From: Laurence Oberman <hidden>
Date: 2023-11-28 19:01:38

On Tue, 2023-11-28 at 10:44 -0800, Song Liu wrote:

Hi David and Laurence,

On Tue, Nov 28, 2023 at 10:13 AM David Jeffery [off-list ref]
wrote:

quoted

During a reshape or a RAID6 array such as expanding by adding an
additional
disk, I/Os to the region of the array which have not yet been
reshaped can
stall indefinitely. This is from errors in the
stripe_ahead_of_reshape
function causing md to think the I/O is to a region in the actively
undergoing the reshape.

stripe_ahead_of_reshape fails to account for the q disk having a
sector
value of 0. By not excluding the q disk from the for loop, raid6
will always
generate a min_sector value of 0, causing a return value which
stalls.

The function's max_sector calculation also uses min() when it
should use
max(), causing the max_sector value to always be 0. During a
backwards
rebuild this can cause the opposite problem where it allows I/O to
advance
when it should wait.

Fixing these errors will allow safe I/O to advance in a timely
manner and
delay only I/O which is unsafe due to stripes in the middle of
undergoing
the reshape.

Fixes: 486f60558607 ("md/raid5: Check all disks in a stripe_head
for reshape progress")
Signed-off-by: David Jeffery <redacted>
Tested-by: Laurence Oberman <redacted>

Thanks for the fix!

Can we add a test to mdadm/tests to cover this case? (Not sure how
reliable
the test will be).

Song

quoted

---
 drivers/md/raid5.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index dc031d42f53b..26e1e8a5e941 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c

@@ -5892,11 +5892,11 @@ static bool stripe_ahead_of_reshape(struct

mddev *mddev, struct r5conf *conf,
        int dd_idx;

        for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) {
-               if (dd_idx == sh->pd_idx)
+               if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx)
                        continue;

                min_sector = min(min_sector, sh-

quoted

dev[dd_idx].sector);

-               max_sector = min(max_sector, sh-

quoted

dev[dd_idx].sector);

+               max_sector = max(max_sector, sh-

quoted

dev[dd_idx].sector);

        }

        spin_lock_irq(&conf->device_lock);
--
2.43.0

Hello Song

The way this was discovered is the customer forced a reshape of a raid6
device by adding an additional device. 
The reshape then sess the hang.

# mdadm /dev/md0 -a /dev/nvme18n1
mdadm: added /dev/nvme18n1

# mdadm --grow --raid-devices=6 /dev/md0

The problem was we needed a test kernel just for the customer to
recover, because on reboot the reshape starts again and hangs the
device access.

Such steps could be added for a test but David will know maybe an
easier way for a test.

Regards
Laurence

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help