Thread: 27 messages, 6 authors, 2013-07-04

Re: Mdadm server eating drives

From: Stan Hoeppner <hidden>
Date: 2013-07-03 17:05:56

On 7/3/2013 12:26 AM, Barrett Lewis wrote:
...
This is all about my dedicated server.  The external enclosure with
the 4 drives, 3 of which in a raid0 is just something I used for
creating an emergency backup, and was plugged directly into the server
via USB (has its own power supply too).  The server is using the
onboard video card on the Asrock z77 extreme 4.

Got it.
...
The other 2 drives in the picture are the source drives that had the
original data that the array was initially populated with.

Got it.  These questions were simply to get a handle on how much +12V
power you needed before recommending a PSU.
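
As a rough illustration of the kind of +12V estimate mentioned above, the sketch below sums a per-drive spin-up current over the number of drives. The drive count (six array members plus the two source drives) comes from the thread, on the assumption that all eight sit on the server's PSU. The 2.0 A spin-up figure, the 25% headroom, and the function name `rail_12v_budget` are placeholders for illustration, not measured or datasheet values for this build.

```python
def rail_12v_budget(drive_count, spinup_amps_per_drive, headroom=0.25):
    """Return (amps, watts) wanted on +12V if every drive spins up at once.

    headroom is a fractional safety margin added on top of the raw sum.
    """
    amps = drive_count * spinup_amps_per_drive * (1 + headroom)
    return amps, amps * 12.0


# 6 array drives + 2 source drives (counts taken from the thread).
# 2.0 A per drive is an ASSUMED placeholder; check the drive's datasheet.
amps, watts = rail_12v_budget(drive_count=8, spinup_amps_per_drive=2.0)
print(f"{amps:.1f} A on +12V, about {watts:.0f} W, for the drives alone")
```

This covers only the drives; CPU, motherboard, and fans draw from +12V as well, so the PSU's +12V rating would need to exceed this figure by a comfortable margin.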

...
I have been really curious about this "beeping" issue since
it is so bizarre.  Anyway like I said only 2 of those original 6 (they
were seagate ST2000DM001) remain.

When power supplies go bad you may witness all kinds of weird things.
If the voltage to the speaker drive circuit fluctuates wildly it can
cause leakage on the output drive, which causes the speaker to make
random noises.

Cheap alternate PSU seemed to work OK so I went to buy a decent
permanent replacement.  I couldn't find either of the two you
suggested at the store (they were closing and I wanted to get this
done).  So I ended up going with a 750w corsair CX750M.  Like magic,
with a new power supply most of the drives seem to be back working,
except the first two that failed out yesterday.  It seems like maybe
the event counters (or something) are too far behind to assemble them
back.  That said, md0 mounts fine and fsck returned clean, so that
deserves some kinda hooray!

The key thing is whether drives keep showing errors in dmesg and
dropping.  If not, your problem is likely solved.  :)

Here is some data about the two (sdd and sdf) that won't socialize
with the other disks.

sudo mdadm --assemble --force --verbose /dev/md0 /dev/sd[a-f]
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 2.
mdadm: added /dev/sdd to /dev/md0 as 1 (possibly out of date)
mdadm: added /dev/sdf to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sde to /dev/md0 as 3
mdadm: added /dev/sda to /dev/md0 as 4
mdadm: added /dev/sdc to /dev/md0 as 5
mdadm: added /dev/sdb to /dev/md0 as 0
mdadm: /dev/md0 has been started with 4 drives (out of 6).
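
The assemble output above can be summarised mechanically. The following is a minimal sketch, assuming the message wording stays exactly as printed here (other mdadm versions may phrase it differently); `summarize_assemble` is a hypothetical helper name, and the sample text is copied from the output above.

```python
import re

# Verbose output copied from the mdadm --assemble run shown above.
ASSEMBLE_OUTPUT = """\
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 2.
mdadm: added /dev/sdd to /dev/md0 as 1 (possibly out of date)
mdadm: added /dev/sdf to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sde to /dev/md0 as 3
mdadm: added /dev/sda to /dev/md0 as 4
mdadm: added /dev/sdc to /dev/md0 as 5
mdadm: added /dev/sdb to /dev/md0 as 0
mdadm: /dev/md0 has been started with 4 drives (out of 6).
"""

SLOT_RE = re.compile(
    r"mdadm: (/dev/\w+) is identified as a member of \S+, slot (\d+)\."
)
ADDED_RE = re.compile(
    r"mdadm: added (/dev/\w+) to \S+ as (\d+)( \(possibly out of date\))?"
)


def summarize_assemble(text):
    """Return (slot -> device mapping, sorted list of stale devices)."""
    slots, stale = {}, []
    for line in text.splitlines():
        m = SLOT_RE.match(line)
        if m:
            slots[int(m.group(2))] = m.group(1)
            continue
        m = ADDED_RE.match(line)
        if m and m.group(3):  # group 3 is the "(possibly out of date)" note
            stale.append(m.group(1))
    return slots, sorted(stale)


slots, stale = summarize_assemble(ASSEMBLE_OUTPUT)
print("slot order:", [slots[i] for i in sorted(slots)])
print("flagged as possibly out of date:", stale)
```

For this run, the two flagged devices are /dev/sdd and /dev/sdf, in slots 1 and 2.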


and from dmesg
[ 4481.356723] md: bind<sdd>
[ 4481.356850] md: bind<sdf>
[ 4481.357007] md: bind<sde>
[ 4481.357134] md: bind<sda>
[ 4481.357248] md: bind<sdc>
[ 4481.357365] md: bind<sdb>
[ 4481.357395] md: kicking non-fresh sdf from array!
[ 4481.357400] md: unbind<sdf>
[ 4481.374480] md: export_rdev(sdf)
[ 4481.374484] md: kicking non-fresh sdd from array!
[ 4481.374488] md: unbind<sdd>
[ 4481.394486] md: export_rdev(sdd)
[ 4481.396164] md/raid:md0: device sdb operational as raid disk 0
[ 4481.396168] md/raid:md0: device sdc operational as raid disk 5
[ 4481.396171] md/raid:md0: device sda operational as raid disk 4
[ 4481.396173] md/raid:md0: device sde operational as raid disk 3
[ 4481.396571] md/raid:md0: allocated 6384kB
[ 4481.396805] md/raid:md0: raid level 6 active with 4 out of 6 devices, algorithm 2
[ 4481.396808] RAID conf printout:
[ 4481.396810]  --- level:6 rd:6 wd:4
[ 4481.396812]  disk 0, o:1, dev:sdb
[ 4481.396814]  disk 3, o:1, dev:sde
[ 4481.396815]  disk 4, o:1, dev:sda
[ 4481.396817]  disk 5, o:1, dev:sdc
[ 4481.396848] md0: detected capacity change from 0 to 8001056407552
[ 4481.426011]  md0: unknown partition table
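
The dmesg excerpt above holds enough to cross-check the array state: which members were kicked, how much redundancy is left, and whether the reported capacity matches 2 TB members. The sketch below is one way to do that arithmetic, using lines copied from the excerpt; `summarize_md_dmesg` is a hypothetical helper, and it only knows the parity counts for RAID 5 and RAID 6.

```python
import re

# Lines copied from the dmesg excerpt above (trimmed to the relevant ones).
DMESG = """\
[ 4481.357395] md: kicking non-fresh sdf from array!
[ 4481.374484] md: kicking non-fresh sdd from array!
[ 4481.396805] md/raid:md0: raid level 6 active with 4 out of 6 devices, algorithm 2
[ 4481.396848] md0: detected capacity change from 0 to 8001056407552
"""

PARITY_DEVICES = {5: 1, 6: 2}  # only these two levels are handled here


def summarize_md_dmesg(text):
    """Extract kicked members, remaining redundancy and per-member size."""
    kicked = re.findall(r"md: kicking non-fresh (\w+) from array!", text)
    state = re.search(r"raid level (\d+) active with (\d+) out of (\d+)", text)
    level, active, total = (int(g) for g in state.groups())
    capacity = int(
        re.search(r"capacity change from \d+ to (\d+)", text).group(1)
    )
    parity = PARITY_DEVICES[level]
    missing = total - active
    return {
        "kicked": kicked,
        "missing": missing,
        "failures_still_tolerated": parity - missing,
        # usable capacity = (total - parity) * per-member data size
        "bytes_per_member": capacity // (total - parity),
    }


print(summarize_md_dmesg(DMESG))
```

With these numbers the per-member size works out to 2000264101888 bytes, consistent with 2 TB drives, and `failures_still_tolerated` is 0: a RAID 6 running on 4 of 6 members has no redundancy left until sdd and sdf are back in or replaced.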

sudo mdadm -E /dev/sd[a-f] | nopaste
http://pastie.org/8105693

sudo smartctl -x /dev/sdd | nopaste
http://pastie.org/8105706

sudo smartctl -x /dev/sdf | nopaste
http://pastie.org/8105707
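
The `mdadm -E` output linked above is not reproduced in the thread, so the guess about event counters cannot be checked from the text here. As a sketch of how one might compare them, the code below pulls the `Events` field for each device out of `mdadm -E` style text and reports how far each member lags the newest one. The sample text and its numbers are invented for illustration and are not the values from the paste; the layout assumed is a `/dev/...:` header line followed by an `Events : N` line, and `events_by_device` and `lagging` are hypothetical helper names.

```python
import re

# INVENTED sample in the general shape of `mdadm -E` output.
# These event counts are placeholders, not the values from the paste.
SAMPLE_EXAMINE = """\
/dev/sdb:
         Events : 1200
/dev/sdd:
         Events : 1100
/dev/sde:
         Events : 1200
/dev/sdf:
         Events : 1150
"""


def events_by_device(examine_text):
    """Map each device header to the Events counter that follows it."""
    events, current = {}, None
    for line in examine_text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        m = re.match(r"^\s*Events\s*:\s*(\d+)\s*$", line)
        if m and current:
            events[current] = int(m.group(1))
    return events


def lagging(events):
    """Return {device: how many events behind the newest member}."""
    newest = max(events.values())
    return {dev: newest - n for dev, n in events.items() if n < newest}


print(lagging(events_by_device(SAMPLE_EXAMINE)))
```

Members whose counter matches the newest value do not appear in the result; any device that does appear is one md would treat as out of date relative to the rest.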


Are sdd and sdf just too out of sync?  Should I zero the superblocks
and re-add them to the array?  Or I could replace them (I have two
unopened WD reds here, but I'd like to return them if I don't really
need them right now).

I'm not an expert on recovery when things go this far south.  Phil and
others are much more knowledgeable about this, so I'll pass the thread
back to them now.

Thanks for the advice about the PSU, I would have never dreamed it
would cause behaviour like that.

You're welcome.  I've spent just a little time around hardware, as you
might have guessed based on my email address.  Started in 1986, so
that's, what, 26 years now?  Damn I'm getting old...

-- 
Stan