Thread (9 messages) 9 messages, 4 authors, 2015-10-31

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

From: Neil Brown <hidden>
Date: 2015-10-30 01:35:16
Also in: lkml
Subsystem: software raid (multiple disks) support, the rest · Maintainers: Song Liu, Yu Kuai, Linus Torvalds

On Fri, Oct 30 2015, Roman Gushchin wrote:
29.10.2015, 03:35, "Neil Brown" [off-list ref]:
quoted
On Wed, Oct 28 2015, Roman Gushchin wrote:
quoted
 After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
 __find_stripe() is called under conf->hash_locks + hash.
 But handle_stripe_clean_event() calls remove_hash() under
 conf->device_lock.

 Under some cirscumstances the hash chain can be circuited,
 and we get an infinite loop with disabled interrupts and locked hash
 lock in __find_stripe(). This leads to hard lockup on multiple CPUs
 and following system crash.

 I was able to reproduce this behavior on raid6 over 6 ssd disks.
 The devices_handle_discard_safely option should be set to enable trim
 support. The following script was used:

 for i in `seq 1 32`; do
     dd if=/dev/zero of=large$i bs=10M count=100 &
 done

 Signed-off-by: Roman Gushchin [off-list ref]
 Cc: Neil Brown [off-list ref]
 Cc: Shaohua Li [off-list ref]
 Cc: linux-raid@vger.kernel.org
 Cc: [off-list ref] # 3.10 - 3.19
Hi Roman,
 thanks for reporting this and providing a fix.

I'm a bit confused by that stable range: 3.10 - 3.19

The commit you identify as introducing the bug was added in 3.13, so
presumably 3.10, 3.11, 3.12 are not affected.
Sure, it's my mistake. Correct range is 3.13 - 3.19. Sorry.
quoted
Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
affected, though the patch needs to be revised a bit for 4.1 and later.
Yes, exactly, but things are a bit more complicated in mainline.
I'll try to prepare a patch for mainline in a couple of days.
Thanks for the confirmation.

Isn't the 4.1 fix just:
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e5befa356dbe..6e4350a78257 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3522,16 +3522,16 @@ returnbi:
 		 * no updated data, so remove it from hash list and the stripe
 		 * will be reinitialized
 		 */
-		spin_lock_irq(&conf->device_lock);
 unhash:
+		spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
 		remove_hash(sh);
+		spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
 		if (head_sh->batch_head) {
 			sh = list_first_entry(&sh->batch_list,
 					      struct stripe_head, batch_list);
 			if (sh != head_sh)
 					goto unhash;
 		}
-		spin_unlock_irq(&conf->device_lock);
 		sh = head_sh;
 
 		if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
??

Or maybe
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e5befa356dbe..704ef7fcfbf8 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3509,6 +3509,7 @@ returnbi:
 
 	if (!discard_pending &&
 	    test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+		int hash;
 		clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
 		clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
 		if (sh->qd_idx >= 0) {
@@ -3522,16 +3523,17 @@ returnbi:
 		 * no updated data, so remove it from hash list and the stripe
 		 * will be reinitialized
 		 */
-		spin_lock_irq(&conf->device_lock);
 unhash:
+		hash = sh->hash_lock_index;
+		spin_lock_irq(conf->hash_locks + hash);
 		remove_hash(sh);
+		spin_unlock_irq(conf->hash_locks + hash);
 		if (head_sh->batch_head) {
 			sh = list_first_entry(&sh->batch_list,
 					      struct stripe_head, batch_list);
 			if (sh != head_sh)
 					goto unhash;
 		}
-		spin_unlock_irq(&conf->device_lock);
 		sh = head_sh;
 
 		if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))

For personal reasons I would like to get this resolved today or
tomorrow, though it would be silly to rush if there is any uncertainty.

Thanks,
NeilBrown

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help