Re: Raid 6, 9 1.5T days drives, 2 "fail" one after the other

From: Phil Turmel <hidden>
Date: 2013-02-19 03:12:11

Hi Neil, Dragos,

On 02/18/2013 07:39 PM, NeilBrown wrote:

On Thu, 07 Feb 2013 10:39:53 -0500 Dragos Dobrescu
[off-list ref] wrote:

Hi, I need some help. I noticed the server was in recovery mode. It
had just dropped a "faulty" drive. I checked the drive and it
looked like it was working. When the recovery was done I added the
drive back, after recreating the partition.

As soon as I did, mdadm informed me that it had dropped it again and
dropped another drive at the same time. I removed both drives and
added a brand new drive (a second is on the way), which the system
accepted and started recovery.

What I don't understand is that I plugged the drives into another
computer with a SATA-to-USB adapter and ran a full SMART check,
which passed, apart from a few bad sectors and some
warnings of past overheating. What is going on? Thank you for your
help.

Dragos

No one replied?  I felt sure someone else would.

No one else saw it.  The message you quoted is not in the archives, and
I never got one directly.

Maybe you have a problem with your drive controller card, or with a
cable, or something.

Or normal UREs.  Dragos, please also share your smartctl reports (with -x).

Normally if md drops a drive there will be messages in the kernel
log about access failures.  Do you still have all the logs from when
this happened? Are there any messages from the kernel?

NeilBrown

Phil
