Re: New RAID causing system lockups
From: Mike Hartman <hidden>
Date: 2010-09-14 02:50:10
quoted hunk ↗ jump to hunk
quoted
quoted
Hi Mike, thanks for the updates. I'm not entirely clear what is happening (in fact, due to a cold that I am still fighting off, nothing is entirely clear at the moment), but it looks very likely that the problem is due to an interplay between barrier handling, and the multi-level structure of your array (a raid0 being a member of a raid5). When a barrier request is processed, both arrays will schedule 'work' to be done by the 'event' thread and I'm guess that you can get into a situation where one work time is wait for the other, but the other is behind the one on the single queue (I wonder if that make sense...) Anyway, this patch might make a difference, It reduced the number of work items schedule in a way that could conceivably fix the problem. If you can test this, please report the results. I cannot easily reproduce the problem so there is limited testing that I can do. Thanks, NeilBrowndiff --git a/drivers/md/md.c b/drivers/md/md.c index f20d13e..7f2785c 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c@@ -294,6 +294,23 @@ EXPORT_SYMBOL(mddev_congested);#define POST_REQUEST_BARRIER ((void*)1) +static void md_barrier_done(mddev_t *mddev) +{ + struct bio *bio = mddev->barrier; + + if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) + bio_endio(bio, -EOPNOTSUPP); + else if (bio->bi_size == 0) + bio_endio(bio, 0); + else { + /* other options need to be handled from process context */ + schedule_work(&mddev->barrier_work); + return; + } + mddev->barrier = NULL; + wake_up(&mddev->sb_wait); +} + static void md_end_barrier(struct bio *bio, int err) { mdk_rdev_t *rdev = bio->bi_private;@@ -310,7 +327,7 @@ static void md_end_barrier(struct bio *bio, int err)wake_up(&mddev->sb_wait); } else /* The pre-request barrier has finished */ - schedule_work(&mddev->barrier_work); + md_barrier_done(mddev); } bio_put(bio); }@@ -350,18 +367,12 @@ static void md_submit_barrier(struct work_struct *ws)atomic_set(&mddev->flush_pending, 1); - if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) - bio_endio(bio, -EOPNOTSUPP); - else if (bio->bi_size == 0) - /* an empty barrier - all done */ - bio_endio(bio, 0); - else { - bio->bi_rw &= ~REQ_HARDBARRIER; - if (mddev->pers->make_request(mddev, bio)) - generic_make_request(bio); - mddev->barrier = POST_REQUEST_BARRIER; - submit_barriers(mddev); - } + bio->bi_rw &= ~REQ_HARDBARRIER; + if (mddev->pers->make_request(mddev, bio)) + generic_make_request(bio); + mddev->barrier = POST_REQUEST_BARRIER; + submit_barriers(mddev); + if (atomic_dec_and_test(&mddev->flush_pending)) { mddev->barrier = NULL; wake_up(&mddev->sb_wait);@@ -383,7 +394,7 @@ void md_barrier_request(mddev_t *mddev, struct bio *bio)submit_barriers(mddev); if (atomic_dec_and_test(&mddev->flush_pending)) - schedule_work(&mddev->barrier_work); + md_barrier_done(mddev); } EXPORT_SYMBOL(md_barrier_request);Neil, thanks for the patch. I experienced the lockup for the 5th time an hour ago (about 3 hours after the last hard reboot) so I thought it would be a good time to try your patch. Unfortunately I'm getting an error: patching file drivers/md/md.c Hunk #1 succeeded at 291 with fuzz 1 (offset -3 lines). Hunk #2 FAILED at 324. Hunk #3 FAILED at 364. Hunk #4 FAILED at 391. 3 out of 4 hunks FAILED -- saving rejects to file drivers/md/md.c.rejThat is odd. I took the md.c that you posted on the web site, use "patch" to apply my patch to it, and only Hunk #3 failed. I used 'wiggle' to apply the patch and it applied perfectly, properly replacing (1<<BIO_RW_BARRIER) with REQ_HARDBARRIER (or the other way around). Try this version. You will need to be in drivers/md/, or use patch drivers/md/md.c < this-patch NeilBrown--- md.c.orig 2010-09-14 11:29:15.000000000 +1000 +++ md.c 2010-09-14 11:29:50.000000000 +1000@@ -291,6 +291,23 @@#define POST_REQUEST_BARRIER ((void*)1) +static void md_barrier_done(mddev_t *mddev) +{ + struct bio *bio = mddev->barrier; + + if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) + bio_endio(bio, -EOPNOTSUPP); + else if (bio->bi_size == 0) + bio_endio(bio, 0); + else { + /* other options need to be handled from process context */ + schedule_work(&mddev->barrier_work); + return; + } + mddev->barrier = NULL; + wake_up(&mddev->sb_wait); +} + static void md_end_barrier(struct bio *bio, int err) { mdk_rdev_t *rdev = bio->bi_private;@@ -307,7 +324,7 @@wake_up(&mddev->sb_wait); } else /* The pre-request barrier has finished */ - schedule_work(&mddev->barrier_work); + md_barrier_done(mddev); } bio_put(bio); }@@ -347,18 +364,12 @@atomic_set(&mddev->flush_pending, 1); - if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) - bio_endio(bio, -EOPNOTSUPP); - else if (bio->bi_size == 0) - /* an empty barrier - all done */ - bio_endio(bio, 0); - else { - bio->bi_rw &= ~(1<<BIO_RW_BARRIER); - if (mddev->pers->make_request(mddev, bio)) - generic_make_request(bio); - mddev->barrier = POST_REQUEST_BARRIER; - submit_barriers(mddev); - } + bio->bi_rw &= ~(1<<BIO_RW_BARRIER); + if (mddev->pers->make_request(mddev, bio)) + generic_make_request(bio); + mddev->barrier = POST_REQUEST_BARRIER; + submit_barriers(mddev); + if (atomic_dec_and_test(&mddev->flush_pending)) { mddev->barrier = NULL; wake_up(&mddev->sb_wait);@@ -380,7 +391,7 @@submit_barriers(mddev); if (atomic_dec_and_test(&mddev->flush_pending)) - schedule_work(&mddev->barrier_work); + md_barrier_done(mddev); } EXPORT_SYMBOL(md_barrier_request);
Sorry about that Neil, it was my fault. I copied your patch out of the email and I think it picked up some unintended characters. I tried copying it from the mailing list archive website instead and it patched in fine. The kernel compiled with no trouble and I'm booted into it now. No unexpected side-effects yet. I'll continue with my copying and we'll see if it locks up again. Thanks for all your help! Mike -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html