Re: mdadm RAID6 faulty drive

From: Phil Turmel <hidden>
Date: 2013-03-25 17:43:50

On 03/25/2013 12:02 PM, Paramasivam, Meenakshisundaram wrote:

> Hi,
>
> As a result of an extended power outage, the FedoraCore 17 machine with
> mdadm RAID went down. Bringing it up, I noticed "faulty /dev/sdf" in
> mdadm --detail. However, mdadm -E /dev/sdf shows "State : clean".
> Details are shown below. When I tried to add the drive to the array,
> the resync fails (I see lots of eSATA bus resets), and I get the same
> message in mdadm --detail.
>
> Questions:
> 1. How can a clean drive be reported faulty?

When the drive is kicked out for I/O errors, its superblock is left as-is
(just as if you pulled its SATA cable).  The remaining devices'
superblocks are marked to show the failed drive, and *their*
superblocks' event count is bumped.  The failed status of that device is
derived during assembly, when its superblock is found to be stale
compared to the others.
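
As a rough illustration of that logic, here is a toy model in Python.  It
is not the kernel's or mdadm's actual code: the Superblock class, the
stale_members() helper, and the event numbers are all made up for the
example, and the real assembly rules are more involved than a single
comparison.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    device: str
    state: str   # what the member's own superblock says, e.g. "clean"
    events: int  # event counter stored in that superblock


def stale_members(superblocks):
    """Return the devices whose event count lags the newest one seen."""
    newest = max(sb.events for sb in superblocks)
    return [sb.device for sb in superblocks if sb.events < newest]


# Hypothetical values: sdf was dropped, so its counter stopped advancing
# while the surviving members kept being updated.
members = [
    Superblock("/dev/sdd", "clean", 1042),
    Superblock("/dev/sde", "clean", 1042),
    Superblock("/dev/sdf", "clean", 1037),
]

print(stale_members(members))  # ['/dev/sdf']
```

Note that every member still says "clean" about itself; only the
comparison between superblocks exposes the one that fell behind.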

> 2. Is there an easy way to mark the drive (/dev/sdf) as "assume-clean"
> and add it?

No.  The closest thing is to use a write-intent bitmap and "re-add"
devices that are disconnected.

That's not your problem.

> Please let me know if I should get an exact replacement drive at
> this stage, pull out the faulty /dev/sdf, and add the new drive to the
> array. Thanks.

You very likely need a new drive.  You might want to try plugging that
drive into a different controller, or a different port on the same
controller, just to narrow the diagnosis.

You could also show us some of the kernel error messages, or show the
output of "smartctl -x /dev/sdf".

Phil
