Thread (4 messages) 4 messages, 3 authors, 2023-11-28

Re: [PATCH] md/raid6: use valid sector values to determine if an I/O should wait on the reshape

From: Laurence Oberman <hidden>
Date: 2023-11-28 19:01:38

On Tue, 2023-11-28 at 10:44 -0800, Song Liu wrote:
Hi David and Laurence,

On Tue, Nov 28, 2023 at 10:13 AM David Jeffery [off-list ref]
wrote:
quoted
During a reshape or a RAID6 array such as expanding by adding an
additional
disk, I/Os to the region of the array which have not yet been
reshaped can
stall indefinitely. This is from errors in the
stripe_ahead_of_reshape
function causing md to think the I/O is to a region in the actively
undergoing the reshape.

stripe_ahead_of_reshape fails to account for the q disk having a
sector
value of 0. By not excluding the q disk from the for loop, raid6
will always
generate a min_sector value of 0, causing a return value which
stalls.

The function's max_sector calculation also uses min() when it
should use
max(), causing the max_sector value to always be 0. During a
backwards
rebuild this can cause the opposite problem where it allows I/O to
advance
when it should wait.

Fixing these errors will allow safe I/O to advance in a timely
manner and
delay only I/O which is unsafe due to stripes in the middle of
undergoing
the reshape.

Fixes: 486f60558607 ("md/raid5: Check all disks in a stripe_head
for reshape progress")
Signed-off-by: David Jeffery <redacted>
Tested-by: Laurence Oberman <redacted>
Thanks for the fix!

Can we add a test to mdadm/tests to cover this case? (Not sure how
reliable
the test will be).

Song
quoted
---
 drivers/md/raid5.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index dc031d42f53b..26e1e8a5e941 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5892,11 +5892,11 @@ static bool stripe_ahead_of_reshape(struct
mddev *mddev, struct r5conf *conf,
        int dd_idx;

        for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) {
-               if (dd_idx == sh->pd_idx)
+               if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx)
                        continue;

                min_sector = min(min_sector, sh-
quoted
dev[dd_idx].sector);
-               max_sector = min(max_sector, sh-
quoted
dev[dd_idx].sector);
+               max_sector = max(max_sector, sh-
quoted
dev[dd_idx].sector);
        }

        spin_lock_irq(&conf->device_lock);
--
2.43.0
Hello Song

The way this was discovered is the customer forced a reshape of a raid6
device by adding an additional device. 
The reshape then sess the hang.

# mdadm /dev/md0 -a /dev/nvme18n1
mdadm: added /dev/nvme18n1

# mdadm --grow --raid-devices=6 /dev/md0

The problem was we needed a test kernel just for the customer to
recover, because on reboot the reshape starts again and hangs the
device access.

Such steps could be added for a test but David will know maybe an
easier way for a test.

Regards
Laurence


Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help