Re: raid1 repair does not repair errors?
From: NeilBrown <hidden>
Date: 2014-02-04 22:51:56
On Tue, 04 Feb 2014 23:34:43 +0400 Michael Tokarev [off-list ref] wrote:
04.02.2014 08:30, NeilBrown wrote:
I'm really on a roll here, aren't I.
Well, we both are, unless I don't understand what "on a roll" means :)
"on a roll" usually means "enjoying a series of successes" though it can be used ironically to mean "suffering a series of failures". I intended the second meaning...
I looked again and that code I've been trying to fix is actually perfectly fine. I'm not sure whether to be happy or sad about that.
But... I've found the bug. I know this time because I actually tested it.
I tested current mainline and it didn't work. So I hunted and found a bug. But that buggy code isn't in 3.10. So I tested 3.10 and it crashed. Ah-ha, I thought. So I looked at 3.10.27, and it has different code. It has the buggy code. So I tested that and it didn't work. Then I applied the patch below, and now it does.
The bug was introduced by
commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
Author: NeilBrown [off-list ref]
Date: Wed Jul 17 15:19:29 2013 +1000
md/raid1: fix bio handling problems in process_checks()
which moved the clearing of bi_flags up in a function to before it was tested. That wasn't really the right thing to do. When that was backported to 3.10 it fixed the crash, but introduced this new bug.
Anyway, enough of my rambling - here is the patch. As I don't much feel like trusting my own results just at the moment, I look forward to your confirmation, one way or the other.
Wow. I see. Indeed, I'm running latest 3.10 now, 3.10.28. I never really thought about testing other versions, because, well, this didn't look like some new issue to me; I thought it was some old stuff which hasn't changed much in 3.13 and up. Well, if either of us had known it was specific to 3.10.y, we'd both have behaved differently from the beginning, wouldn't we? :)
So I tried your patch (on top of my initial just-the-debugging changes), had to fix a few MIME =damages on the go, but that is not really interesting. And this version actually appears to work, but does so silently.
I probably should get md to be a little more verbose when it tries to fix IO errors. People like to know....
After a repair run with your last patch applied, I see this:
[ 767.456457] md: requested-resync of RAID array md1
[ 767.486818] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 767.517404] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
[ 767.548977] md: using 128k window, over a total of 2096064k.
[ 808.174908] ata6.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
[ 808.206395] ata6.00: irq_stat 0x40000008
[ 808.237186] ata6.00: failed command: READ FPDMA QUEUED
[ 808.267635] ata6.00: cmd 60/80:00:00:3e:3e/00:00:00:00:00/40 tag 0 ncq 65536 in
[ 808.267635] res 41/40:00:23:3e:3e/00:00:00:00:00/40 Emask 0x409 (media error) <F>
[ 808.329226] ata6.00: status: { DRDY ERR }
[ 808.359915] ata6.00: error: { UNC }
[ 808.392438] ata6.00: configured for UDMA/133
[ 808.421989] sd 5:0:0:0: [sdd] Unhandled sense code
[ 808.451361] sd 5:0:0:0: [sdd]
[ 808.480329] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 808.509679] sd 5:0:0:0: [sdd]
[ 808.538719] Sense Key : Medium Error [current] [descriptor]
[ 808.568061] Descriptor sense data with sense descriptors (in hex):
[ 808.597257] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 808.626981] 00 3e 3e 23
[ 808.656380] sd 5:0:0:0: [sdd]
[ 808.685550] Add. Sense: Unrecovered read error - auto reallocate failed
[ 808.715375] sd 5:0:0:0: [sdd] CDB:
[ 808.744933] Read(10): 28 00 00 3e 3e 00 00 00 80 00
[ 808.774678] end_request: I/O error, dev sdd, sector 4079139
[ 808.804412] end_sync_read: !BIO_UPTODATE
[ 808.834040] ata6: EH complete
[ 809.486124] md: md1: requested-resync done.
and now, all pending sectors are gone from the drive, and subsequent reads of this place do not produce any errors.
Excellent!
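As a cross-check of the log above, the Read(10) CDB and the descriptor sense data can be decoded by hand. The sketch below assumes the standard Read(10) layout (bytes 2-5 are the big-endian LBA, bytes 7-8 the transfer length in sectors) and that the first sense descriptor is an Information descriptor whose 8-byte field holds the failing LBA. The function names are illustrative only and do not belong to any existing tool.

```python
def parse_hex(text):
    """Turn a space-separated hex dump such as '28 00 3e' into bytes."""
    return bytes(int(byte, 16) for byte in text.split())


def decode_read10(cdb):
    """Return (start_lba, sector_count) from a 10-byte SCSI Read(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")
    sector_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, sector_count


def sense_information_lba(sense):
    """Return the Information field of descriptor-format sense data.

    Assumes response code 0x72 and that the first descriptor (at byte 8)
    is of type 0x00 (Information), as in the log above.
    """
    if sense[0] != 0x72 or sense[8] != 0x00:
        raise ValueError("unexpected sense data layout")
    return int.from_bytes(sense[12:20], "big")


# Values copied from the kernel log in this thread.
cdb = parse_hex("28 00 00 3e 3e 00 00 00 80 00")
sense = parse_hex(
    "72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 00 3e 3e 23"
)

start, count = decode_read10(cdb)
bad = sense_information_lba(sense)
print(start, count, bad, bad - start)
```

Decoded this way, the failed command was a 128-sector read starting at sector 4079104, and the sector named in the end_request line, 4079139, lies 35 sectors into it.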
However, mismatch_cnt right after this repair run shows 128 (and never goes larger than 0 on subsequent repair runs). I'm not sure what this 128 really means; shouldn't it be just one for a single unreadable 512-byte sector?
md/raid1 doesn't read individual sectors - it reads 64K at a time and if it sees a problem it reports that as 128 sectors. I agree this isn't ideal, but refining the error down to just one sector is a lot of work for fairly little gain.
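The arithmetic behind the 128 is short enough to spell out. This is only an illustration, assuming 512-byte sectors and the 64K read size mentioned above; the function name is made up for the example and is not an md interface.

```python
SECTOR_BYTES = 512            # assumed logical sector size
SYNC_READ_BYTES = 64 * 1024   # read size described above


def reported_mismatch_sectors(bad_reads,
                              read_bytes=SYNC_READ_BYTES,
                              sector_bytes=SECTOR_BYTES):
    """Sectors counted as mismatched when `bad_reads` whole reads are flagged."""
    return bad_reads * (read_bytes // sector_bytes)


print(reported_mismatch_sectors(1))
```

So a single unreadable sector still shows up as one flagged 64K read, which is 128 sectors.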
At the same time, mdadm --monitor reports:
Feb 4 23:19:24 mother mdadm[4793]: RebuildFinished event detected on md device /dev/md1
Feb 4 23:21:13 mother mdadm[4793]: RebuildFinished event detected on md device /dev/md1, component device mismatches found: 128 (on raid level 1)
So, your patch appears to work now; the only issue is that it is too silent: I'd expect to see at least some mention of "repairing this or that block", or something like that.
Meanwhile I found an interesting option of hdparm -- it is --make-bad-sector. So, despite all the warnings around it, I tried it on this very same prod. server, and marked the same sector as bad again, and re-ran the whole thing (verifying that a read of that sector actually produces an error). And it all repeated exactly: the repair run silently fixed the error and reported 128 found mismatches, and after the repair run, this place is readable again.
(What I'd love to see now, which is not related to mdadm in any way - is an ability to remap this place on the drive once and for all, making the first Reallocate_Event_Count actually happen, to not bother with it ever again. As was possible with good old SCSI drives, for many years.. Does anyone know if it is still possible today with SATA drives? To remap this place and be done with it, instead of repeating the same - rewrite, it is good now, but with time it becomes unreadable, so rewrite it again, ad infinitum...)
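The retest cycle described above (mark the sector bad, confirm the read fails, run a repair, confirm the read succeeds) might look roughly like the following. This is a dry-run sketch that only prints commands and never executes them. The device, sector and array names are taken from the log in this thread; the exact commands used in the original test are not given in the message, so the hdparm --read-sector check and the sysfs writes are assumptions about one plausible way to do it. hdparm --make-bad-sector is destructive and should not be pointed at a disk holding data you care about.

```python
def bad_sector_retest_plan(disk="/dev/sdd", sector=4079139, array="md1"):
    """Return shell commands for one retest cycle, without running them.

    All defaults come from the log in this thread; the helper itself is
    hypothetical and exists only for illustration.
    """
    sysfs = f"/sys/block/{array}/md"
    return [
        # 1. Deliberately mark the sector unreadable (DANGEROUS).
        f"hdparm --yes-i-know-what-i-am-doing --make-bad-sector {sector} {disk}",
        # 2. Confirm that reading it now fails with an I/O error.
        f"hdparm --read-sector {sector} {disk}",
        # 3. Ask md to run a repair pass on the array.
        f"echo repair > {sysfs}/sync_action",
        # 4. Once the repair has finished, see how many sectors were flagged.
        f"cat {sysfs}/mismatch_cnt",
        # 5. Confirm the sector is readable again.
        f"hdparm --read-sector {sector} {disk}",
    ]


for command in bad_sector_retest_plan():
    print(command)
```

Printing the plan rather than running it keeps the example safe to try; step 4 only makes sense after sync_action has returned to idle.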
Thanks,
Thank you! Should I try the 3.13 kernel too (now that I know how to make a bad sector), just to verify it works fine without additional patches?
No, the same bug is present in every kernel since 3.10.something. I'll send a patch upstream soon now that I have definite confirmation from you that it works.
Thanks,
NeilBrown