Re: raid1 repair does not repair errors?

From: NeilBrown <hidden>
Date: 2014-02-04 22:51:56

On Tue, 04 Feb 2014 23:34:43 +0400 Michael Tokarev [off-list ref] wrote:

04.02.2014 08:30, NeilBrown wrote:
[]

quoted

I'm really on a roll here, aren't I.

Well, we both are, unless I don't understand what "on a roll" means :)

"on a roll" usually means "enjoying a series of successes" though it can be
used ironically to mean "suffering a series of failures".  I intended the
second meaning...

quoted

I looked again and that code I've been trying to fix is actually perfectly
fine.  I'm not sure whether to be happy or sad about that.

But... I've found the bug.  I know this time because I actually tested it.
I tested current mainline and it didn't work.  So I hunted and found a
bug.
But that buggy code isn't in 3.10.
So I tested 3.10 and it crashed.
Ah-ha, I thought.  So I looked at 3.10.27, and it has different code.  It has
the buggy code.  So I tested that and it didn't work.
Then I applied the patch below, and now it does.

The bug was introduced by

commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
Author: NeilBrown [off-list ref]
Date:   Wed Jul 17 15:19:29 2013 +1000

    md/raid1: fix bio handling problems in process_checks()

which moved the clearing of bi_flags up in a function to before it was
tested.  That wasn't really the right thing to do.

When that was backported to 3.10 it fixed the crash, but introduced this new
bug.
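
The ordering problem described above can be illustrated with a toy model. This is only a sketch, not the kernel code: `ToyBio`, `reset`, `process_buggy` and `process_fixed` are hypothetical names, and it assumes that preparing a bio for reuse returns its flags to a default "up to date" state.

```python
from dataclasses import dataclass


@dataclass
class ToyBio:
    """Hypothetical stand-in for a bio; it only models an 'uptodate' flag."""
    uptodate: bool = True

    def reset(self) -> None:
        # Assumption: preparing the bio for reuse restores the default flags.
        self.uptodate = True


def process_buggy(bio: ToyBio) -> bool:
    """Reset first, test afterwards: the read error is forgotten."""
    bio.reset()
    return not bio.uptodate  # always False, so no repair is attempted


def process_fixed(bio: ToyBio) -> bool:
    """Remember the test result before the reset wipes it."""
    read_failed = not bio.uptodate
    bio.reset()
    return read_failed


print(process_buggy(ToyBio(uptodate=False)))  # False: the error goes unnoticed
print(process_fixed(ToyBio(uptodate=False)))  # True: the error is seen
```

In this model the fix is simply to sample the flag before anything clears it; whether the real patch takes exactly that shape is not shown here.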

Anyway enough of my rambling - here is the patch.  As I don't much feel like
trusting my own results just at the moment I look forward to your
confirmation, one way or the other.

Wow.  I see.
Indeed, I'm running latest 3.10 now, 3.10.28.  I never really thought
about testing other versions, because, well, this didn't look like some
new issue to me, I thought it was some old stuff which hasn't changed
much in 3.13 and up.  Well, if either of us had known it is specific to 3.10.y,
we'd both have behaved differently from the beginning, wouldn't we? :)

So I tried your patch (on top of my initial just-the-debugging changes), had to
fix a few MIME =damages on the go, but that is not really interesting.  And
this version actually appears to work, but does so silently.

I probably should get md to be a little more verbose when it tries to fix IO
errors.  People like to know....

After a repair run with your last patch applied, I see this:

[  767.456457] md: requested-resync of RAID array md1
[  767.486818] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  767.517404] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
[  767.548977] md: using 128k window, over a total of 2096064k.
[  808.174908] ata6.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
[  808.206395] ata6.00: irq_stat 0x40000008
[  808.237186] ata6.00: failed command: READ FPDMA QUEUED
[  808.267635] ata6.00: cmd 60/80:00:00:3e:3e/00:00:00:00:00/40 tag 0 ncq 65536 in
[  808.267635]          res 41/40:00:23:3e:3e/00:00:00:00:00/40 Emask 0x409 (media error) <F>
[  808.329226] ata6.00: status: { DRDY ERR }
[  808.359915] ata6.00: error: { UNC }
[  808.392438] ata6.00: configured for UDMA/133
[  808.421989] sd 5:0:0:0: [sdd] Unhandled sense code
[  808.451361] sd 5:0:0:0: [sdd]
[  808.480329] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  808.509679] sd 5:0:0:0: [sdd]
[  808.538719] Sense Key : Medium Error [current] [descriptor]
[  808.568061] Descriptor sense data with sense descriptors (in hex):
[  808.597257]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[  808.626981]         00 3e 3e 23
[  808.656380] sd 5:0:0:0: [sdd]
[  808.685550] Add. Sense: Unrecovered read error - auto reallocate failed
[  808.715375] sd 5:0:0:0: [sdd] CDB:
[  808.744933] Read(10): 28 00 00 3e 3e 00 00 00 80 00
[  808.774678] end_request: I/O error, dev sdd, sector 4079139
[  808.804412] end_sync_read: !BIO_UPTODATE
[  808.834040] ata6: EH complete
[  809.486124] md: md1: requested-resync done.

and now, all pending sectors are gone from the drive, and subsequent reads
of this place do not produce any errors.
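
As a cross-check of the log above, the Read(10) CDB can be decoded by hand. The sketch below assumes 512-byte logical sectors; `decode_read10` is a hypothetical helper written for this note, and it relies only on the standard Read(10) layout (opcode 0x28, LBA in bytes 2-5, transfer length in bytes 7-8).

```python
SECTOR_SIZE = 512  # bytes; assumed logical sector size


def decode_read10(cdb_hex: str) -> tuple[int, int]:
    """Return (start LBA, sector count) from a SCSI Read(10) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, count


start, count = decode_read10("28 00 00 3e 3e 00 00 00 80 00")
failed = 4079139  # from the "end_request: I/O error" line

print(start, count)                     # 4079104 128
print(count * SECTOR_SIZE)              # 65536
print(failed - start)                   # 35
print(start <= failed < start + count)  # True
```

So the failed command was a single 128-sector (64K) read, and the sector reported by end_request sits 35 sectors into it.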

Excellent!

However, mismatch_cnt right after this repair run shows 128 (and never goes
larger than 0 on subsequent repair runs).  I'm not sure what this 128 really
means; shouldn't it be just 1 for a single unreadable 512-byte sector?

md/raid1 doesn't read individual sectors - it reads 64K at a time and if it
sees a problem it reports that as 128 sectors.  I agree this isn't ideal, but
refining the error down to just one sector is a lot of work for fairly little
gain.
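
The arithmetic behind the 128 can be sketched as follows. This is a simplified model and not the md code: `reported_mismatches` is a hypothetical function, and it assumes the 64K read units are aligned to multiples of 128 sectors, which may not hold in practice.

```python
WINDOW_SECTORS = 64 * 1024 // 512  # 64K read unit in 512-byte sectors


def reported_mismatches(bad_sectors, window=WINDOW_SECTORS):
    """Count every window containing a bad sector as `window` sectors."""
    windows_hit = {sector // window for sector in bad_sectors}
    return len(windows_hit) * window


print(WINDOW_SECTORS)                           # 128
print(reported_mismatches([4079139]))           # 128
print(reported_mismatches([4079139, 4079140]))  # 128
```

Under this model, one bad sector and two adjacent bad sectors in the same 64K unit are reported identically.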

At the same time, mdadm --monitor reports:

Feb  4 23:19:24 mother mdadm[4793]: RebuildFinished event detected on md device /dev/md1
Feb  4 23:21:13 mother mdadm[4793]: RebuildFinished event detected on md device /dev/md1, component device  mismatches found: 128 (on raid level 1)

So, your patch appears to work now, the only issue is that it is too silent:
I'd expect to see at least some mention of "repairing this or that block", or
something like that.

Meanwhile I found an interesting option of hdparm -- it is --make-bad-sector.
So, despite all the warnings around it, I tried it on this very same prod.
server, and marked the same sector as bad again, and re-ran the whole thing
(verifying that read of that sector actually produces an error).  And it all
repeated exactly: repair run silently fixed the error and reported 128 found
mismatches, and after repair run, this place is readable again.
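
For anyone repeating this test, the md side of it (starting a repair and reading the counter afterwards) can be scripted through the `sync_action` and `mismatch_cnt` files that md exposes in sysfs. The helper names below are hypothetical, and the `sysfs_root` parameter exists only so the sketch can be tried against a scratch directory instead of a live array.

```python
from pathlib import Path


def md_attr(array: str, name: str, sysfs_root: str = "/sys") -> Path:
    """Path of an md attribute, e.g. /sys/block/md1/md/sync_action."""
    return Path(sysfs_root) / "block" / array / "md" / name


def request_repair(array: str, sysfs_root: str = "/sys") -> None:
    """Ask md to start a repair pass (needs root on a real system)."""
    md_attr(array, "sync_action", sysfs_root).write_text("repair\n")


def read_mismatch_cnt(array: str, sysfs_root: str = "/sys") -> int:
    """Return the mismatch count, in sectors, from the last check/repair."""
    return int(md_attr(array, "mismatch_cnt", sysfs_root).read_text())
```

On a real system one would call `request_repair("md1")`, wait until `sync_action` reads `idle` again, and only then call `read_mismatch_cnt("md1")`.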


(What I'd love to see now, which is not related to mdadm in any way, is the
ability to remap this place on the drive once and for all, making the first
Reallocated_Event_Count actually happen, so as not to bother with it ever again.
As was possible with good old SCSI drives, for many years.  Does anyone know if
it is still possible today with SATA drives?  To remap this place and be done
with it, instead of repeating the same cycle: rewrite it, it is good now, but with
time it becomes unreadable, so rewrite it again, ad infinitum...)

quoted

Thanks,

Thank you!

Should I try the 3.13 kernel too (now that I know how to make a bad sector),
just to verify it works fine without additional patches?

No, the same bug is present in every kernel since 3.10.something.
I'll send a patch upstream soon now that I have definite confirmation from
you that it works.

Thanks,
NeilBrown
