Thread: 20 messages, 4 authors, 2013-02-10

Re: RAID5 with 2 drive failure at the same time

From: Phil Turmel <hidden>
Date: 2013-02-03 01:22:01

On 02/02/2013 10:55 AM, Christoph Nelles wrote:

[trim /]
> You are right, the Hitachis support that. I thought disabled means not
> possible. My fault.
> Nevertheless I put the smartctl -x -a logs at
> http://evilazrael.net/bilder2/logs/smart_xa_20130202.tar.gz

Very good.

> I am currently reading about TLER, and I am wondering why I haven't
> heard of that before. Looks like the lower power consumption is not the
> only advantage of the WDC Red Edition. Most reviews do not go so deep
> into detail.

"TLER" == "Time Limited Error Recovery", which is WD's name for "SCTERC"
== "SMART Command Transport, Error Recovery Control".  Same purpose.
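
A minimal sketch of how this setting can be inspected or changed, assuming smartmontools is installed. smartctl's "-l scterc" option takes the read and write timeouts in units of 100 ms. The helper names and the 7.0-second default are assumptions for illustration only, and a drive without the feature will just report that it is unsupported.

```python
def scterc_query_cmd(device):
    """Return the argv that asks a drive to report its current ERC timeouts."""
    return ["smartctl", "-l", "scterc", device]


def scterc_set_cmd(device, read_ds=70, write_ds=70):
    """Return the argv that sets ERC read/write timeouts (units of 100 ms).

    The default of 70 (7.0 seconds) is an assumed example value,
    not something taken from this thread.
    """
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


# To apply it (as root), hand the list to subprocess, for example:
#   import subprocess
#   subprocess.run(scterc_set_cmd("/dev/sdX"), check=True)
```

Building the argument list separately from running it makes it easy to print and check the command before touching a drive.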

> sdg is a new WDC Red I bought today so all drives from sdg moved one
> letter down.
>
> Spent the last three hours analysing why the second onboard controller
> does not detect the new HDD. In the end it's a Marvell, IOMMU and linux
> driver problem:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1005226
> https://bugzilla.kernel.org/show_bug.cgi?id=42679

That sucks.

> Marvell = PITA :(

Indeed.

[trim /]
>> If you did destroy that drive's contents, you need to clean up the UREs
>> on the other drives with dd_rescue, then "--assemble --force" with the
>> remaining drives.
>
> ddrescue is running, this will take some hours.

Ok.

>> I think it would be useful to provide a fresh set of "mdadm --examine"
>> reports for all member disks, along with a partial listing of
>> /dev/disk/by-id/ that shows what serial numbers are assigned to what
>> device names.
>
> How do the serial numbers help?

It is vital to keep track of raid device number (logical position in the
array) versus drive serial numbers, as device names are not guaranteed
to be consistent between boots (and certainly not when mucking around
with cables and connectors).

> I attached both to this mail.

Ok.

Summarizing:

ata-SAMSUNG_SSD_830_Series_S0XYNEAC504407 -> ../../sda
ata-ST3000DM001-9YN166_Z1F0D9AW -> ../../sdb
ata-WDC_WD30EZRX-00MMMB0_WD-WMAWZ0236402 -> ../../sdc
ata-WDC_WD30EZRS-00J99B0_WD-WCAWZ0319650 -> ../../sdd
ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T1267036 -> ../../sde
ata-WDC_WD30EURS-63R8UY0_WD-WCAWZ2236938 -> ../../sdf
ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T2001070 -> ../../sdg
ata-Hitachi_HDS723030ALA640_MK0311YHG6DS3A -> ../../sdh
ata-Hitachi_HDS723030ALA640_MK0311YHG32VNA -> ../../sdi
ata-Hitachi_HDS723030ALA640_MK0311YHG248EA -> ../../sdj
ata-WDC_WD30EZRX-00MMMB0_WD-WCAWZ1394037 -> ../../sdk

and

/dev/sdb1:
   Device Role : Active device 6
/dev/sdc1:
   Device Role : Active device 0
/dev/sdd1:
   Device Role : Active device 3
/dev/sde1:
   Device Role : Active device 8
/dev/sdf1:
   Device Role : Active device 7
/dev/sdh1:
   Device Role : spare
/dev/sdi1:
   Device Role : Active device 2
/dev/sdj1:
   Device Role : Active device 4
/dev/sdk1:
   Device Role : Active device 5

When ddrescue has finished, verify the mapping again. lsdrv[1] gives you
both pieces of information in one utility; you might find it easier than
mapping by hand.
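
If you do want to build the mapping by hand, the join between the two listings above can be sketched as follows. This is only an illustration: the function names are hypothetical, the input is the text summarized in this mail rather than live output, and it assumes simple sdX-style names with numeric partition suffixes.

```python
import re

BY_ID = """\
ata-SAMSUNG_SSD_830_Series_S0XYNEAC504407 -> ../../sda
ata-ST3000DM001-9YN166_Z1F0D9AW -> ../../sdb
ata-WDC_WD30EZRX-00MMMB0_WD-WMAWZ0236402 -> ../../sdc
ata-WDC_WD30EZRS-00J99B0_WD-WCAWZ0319650 -> ../../sdd
ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T1267036 -> ../../sde
ata-WDC_WD30EURS-63R8UY0_WD-WCAWZ2236938 -> ../../sdf
ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T2001070 -> ../../sdg
ata-Hitachi_HDS723030ALA640_MK0311YHG6DS3A -> ../../sdh
ata-Hitachi_HDS723030ALA640_MK0311YHG32VNA -> ../../sdi
ata-Hitachi_HDS723030ALA640_MK0311YHG248EA -> ../../sdj
ata-WDC_WD30EZRX-00MMMB0_WD-WCAWZ1394037 -> ../../sdk
"""

EXAMINE = """\
/dev/sdb1:
   Device Role : Active device 6
/dev/sdc1:
   Device Role : Active device 0
/dev/sdd1:
   Device Role : Active device 3
/dev/sde1:
   Device Role : Active device 8
/dev/sdf1:
   Device Role : Active device 7
/dev/sdh1:
   Device Role : spare
/dev/sdi1:
   Device Role : Active device 2
/dev/sdj1:
   Device Role : Active device 4
/dev/sdk1:
   Device Role : Active device 5
"""


def parse_by_id(text):
    """Map kernel name (e.g. 'sdb') to its /dev/disk/by-id name."""
    mapping = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\S+)\s+->\s+\S*/(\w+)\s*$", line)
        if m:
            mapping[m.group(2)] = m.group(1)
    return mapping


def parse_roles(text):
    """Map whole-disk kernel name to the 'Device Role' string of its member."""
    roles = {}
    current = None
    for line in text.splitlines():
        if line.startswith("/dev/"):
            name = line.strip().rstrip(":")[len("/dev/"):]
            current = name.rstrip("0123456789")  # sdb1 -> sdb
        elif "Device Role" in line and current:
            roles[current] = line.split(":", 1)[1].strip()
    return roles


def role_table(by_id_text, examine_text):
    """Return (role, kernel name, by-id name) rows, active roles first."""
    ids = parse_by_id(by_id_text)
    roles = parse_roles(examine_text)
    rows = [(role, dev, ids.get(dev, "?")) for dev, role in roles.items()]

    def sort_key(row):
        m = re.search(r"(\d+)$", row[0])
        return (m is None, int(m.group(1)) if m else 0)

    return sorted(rows, key=sort_key)


if __name__ == "__main__":
    for role, dev, by_id in role_table(BY_ID, EXAMINE):
        print(f"{role:16}  {dev}  {by_id}")
```

Keying the table on the by-id name, which embeds the serial number, is the point: the role stays attached to the physical drive even if the sdX letters shift again.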

Phil

[1] http://github.com/pturmel/lsdrv
