Re: request help with RAID1 array that endlessly attempts to sync


From: Phil Turmel <hidden>
Date: 2013-12-17 17:55:51

Hi Julie,

On 12/17/2013 11:53 AM, Julie Ashworth wrote:

> hi all, The sync ran overnight, and smartctl reports 60 errors on
> /dev/sdb this morning. So, it seems like the drive is doomed.

You haven't actually posted enough data from smartctl to say that,
though failures in the vicinity of three years are not surprising.

Please post the output of "smartctl -x" for both of these drives.

> It's frustrating, because this has happened twice in the last month,
> where a disk failed in a RAID1, I replaced the drive, and the 'good'
> drive failed during the sync. Last time I rebuilt from scratch. I
> presume that is my fate this time.

"Good drives failing during rebuild" is a big red flag suggesting
timeout mismatches combined with lack of scrubbing.
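
For readers unfamiliar with scrubbing: md exposes it through sysfs, where
writing "check" to an array's sync_action file starts a read pass over every
member. The following is a minimal sketch of that, not a tested tool. The
function names and the sysfs_root parameter are hypothetical, added only so
the sketch can be exercised without a real array, and writing to the real
/sys needs root.

```python
from pathlib import Path


def request_scrub(md_name: str, sysfs_root: str = "/sys") -> Path:
    """Ask md to start a read-check ("scrub") of the named array, e.g. "md0"."""
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    # Writing "check" starts a pass that reads every sector of every member.
    # Sectors that fail to read are rewritten from the redundant copy, which
    # is what keeps latent bad sectors from piling up unnoticed.
    action.write_text("check\n")
    return action


def mismatch_count(md_name: str, sysfs_root: str = "/sys") -> int:
    """Return the mismatch counter md reports for the most recent check."""
    counter = Path(sysfs_root) / "block" / md_name / "md" / "mismatch_cnt"
    return int(counter.read_text())
```

Run from cron once a week, request_scrub("md0") would give the kind of
regular scrub referred to above. Many distributions already ship a periodic
job that does the same thing.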

> I plan to use RAID6 in the future, but I still have important servers
> with RAID1 arrays. Do you folks recommend replacing HDDs before they
> report errors? The drives are all ~3 years old - Seagate.

I replace drives when they reach 10 relocations, given weekly scrubs.
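
That rule of thumb can be checked mechanically. The sketch below assumes the
usual ten-column attribute table printed by "smartctl -A", in which attribute
ID 5 (Reallocated_Sector_Ct) carries the relocation count in the last column.
The function names are hypothetical, and the threshold of 10 is simply the
figure given above, not a vendor limit.

```python
from typing import Optional

REPLACE_AT = 10  # relocation count at which the drive gets replaced


def reallocated_sectors(smartctl_attr_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 5, or None if it is absent."""
    for line in smartctl_attr_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0] == "5":
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value in a format this sketch does not handle
    return None


def should_replace(smartctl_attr_output: str, threshold: int = REPLACE_AT) -> bool:
    """True once the relocation count has reached the threshold."""
    count = reallocated_sectors(smartctl_attr_output)
    return count is not None and count >= threshold
```

The count only means something if the array is scrubbed regularly, since a
drive relocates a sector only after the bad sector has been found and
rewritten.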

> I should probably stop the sync. I presume the best way to do this is
> to fail/remove /dev/sda (the new disk).

Maybe not.  Please tell us you know all about error recovery timeouts
and the timeout mismatch problem commonly encountered with
consumer-grade hard drives.  Otherwise, you might want to search the list
archives for various combinations of the keywords "scterc", "error
recovery", "timeout mismatch", "URE", and/or "bit error rate".
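
As a rough illustration of what "timeout mismatch" means: the drive's error
recovery limit (reported by "smartctl -l scterc" in tenths of a second) has
to be shorter than the kernel's per-command timeout, which Linux exposes in
/sys/block/<dev>/device/timeout and which defaults to 30 seconds. If the
drive is still retrying a bad sector when the kernel gives up, the drive gets
reset and dropped from the array instead of having the sector rewritten. The
sketch below encodes that comparison. The function names are hypothetical,
and the 180-second figure for drives without error recovery control is an
assumed conservative value, not something stated in this message.

```python
from pathlib import Path
from typing import Optional

DEFAULT_KERNEL_TIMEOUT_S = 30   # Linux default SCSI command timeout
ASSUMED_SAFE_TIMEOUT_S = 180    # assumption: long enough for drives without ERC


def kernel_timeout_seconds(dev: str, sysfs_root: str = "/sys") -> int:
    """Read the kernel's command timeout for a disk such as "sdb"."""
    path = Path(sysfs_root) / "block" / dev / "device" / "timeout"
    return int(path.read_text())


def has_timeout_mismatch(erc_deciseconds: Optional[int],
                         kernel_timeout_s: int = DEFAULT_KERNEL_TIMEOUT_S) -> bool:
    """True if the drive may still be retrying when the kernel gives up.

    erc_deciseconds is the drive's read recovery limit in tenths of a
    second; None or 0 means ERC is unsupported or disabled.
    """
    if not erc_deciseconds:
        # Without ERC the drive's retry time is unbounded as far as the host
        # knows, so only a generously raised kernel timeout counts as safe.
        return kernel_timeout_s < ASSUMED_SAFE_TIMEOUT_S
    return erc_deciseconds / 10 >= kernel_timeout_s
```

For example, a drive set to 7.0 seconds (scterc value 70) with the default
30-second kernel timeout gives has_timeout_mismatch(70) == False, while a
consumer drive with no ERC support and the default timeout gives
has_timeout_mismatch(None) == True.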

Phil
