Re: How do I tell which disk failed?
From: Stan Hoeppner <hidden>
Date: 2013-01-08 23:03:00
On 1/8/2013 3:54 PM, Ross Boylan wrote:
> I am less excited about that since discovering the message about sdb does not mean it's running at over 100 degrees Celsius (the raw value is around 45).
You must ignore the VALUE and WORST columns for drive temp. These are "normalized" values only the smartmon idiots understand. The actual temp of 45C is a bit high, but well within the operating range for that drive. The WDC drives have a max temp (failure) of 80C IIRC, and a normal max operating temp of 65C. So you don't need to worry about this drive's temp.
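
To read the temperature without being misled by the normalized columns, something like the following Python sketch pulls only the RAW_VALUE field out of `smartctl -A` output. The function name `raw_temperature` and the sample table are made up for illustration (the sample is not Ross's actual output), and the sketch assumes the usual ten-column attribute table.

```python
def raw_temperature(smartctl_output):
    """Return the raw temperature (deg C) from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        fields = line.split(None, 9)
        if len(fields) == 10 and fields[1] in ("Temperature_Celsius",
                                               "Airflow_Temperature_Cel"):
            # RAW_VALUE may be "45" or "45 (Min/Max 20/50)"; keep the first number.
            return int(fields[9].split()[0])
    return None


# Hypothetical sample, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7300
194 Temperature_Celsius     0x0022   107   098   000    Old_age   Always       -       45
"""

if __name__ == "__main__":
    print(raw_temperature(SAMPLE))
```

On a live system the text would come from running `smartctl -A /dev/sdb` as root; VALUE (107 in the sample) and WORST are deliberately never read.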
> The logs from the restart show
> Jan 7 17:19:09 markov kernel: [ 2.928055] ata2.00: SATA link down (SStatus 0 SControl 0)
> Jan 7 17:19:09 markov kernel: [ 2.928102] ata2.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Jan 7 17:19:09 markov kernel: [ 2.944459] ata2.01: ATA-8: WDC WD2003FYYS-02W0B1, 01.01D02, max UDMA/133
>
> Jan 7 17:19:09 markov kernel: [ 2.220056] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Jan 7 17:19:09 markov kernel: [ 2.220103] ata1.01: SATA link down (SStatus 0 SControl 310)
> Jan 7 17:19:09 markov kernel: [ 2.228670] ata1.00: ATA-8: ST3750330NS, SN05, max UDMA/133
> The SATA link down messages sound a little odd.
No mystery here. These ports (links) are down because no drives are connected to them, apparently.

Show full dmesg output, and tell us which SAS/SATA controllers are in the system in question and the port count on each.

--
Stan
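
For a quick overview of which links actually have a drive attached, the kernel messages quoted in this thread can be tabulated with a short Python sketch. The function name `summarize_links` and the regular expressions are assumptions for illustration, written only against the log lines shown here; other kernel versions may word these messages differently.

```python
import re

# Patterns written against the messages quoted in this thread (an assumption).
LINK_RE = re.compile(r"(ata\d+\.\d+): SATA link (up|down)")
MODEL_RE = re.compile(r"(ata\d+\.\d+): ATA-\d+: ([^,]+),")


def summarize_links(log_text):
    """Map each ataN.M link to its state ('up'/'down') and drive model, if any."""
    ports = {}
    for line in log_text.splitlines():
        link = LINK_RE.search(line)
        if link:
            entry = ports.setdefault(link.group(1), {"link": None, "model": None})
            entry["link"] = link.group(2)
        model = MODEL_RE.search(line)
        if model:
            entry = ports.setdefault(model.group(1), {"link": None, "model": None})
            entry["model"] = model.group(2).strip()
    return ports


# Log lines copied from the quoted message.
LOG = """\
Jan 7 17:19:09 markov kernel: [ 2.928055] ata2.00: SATA link down (SStatus 0 SControl 0)
Jan 7 17:19:09 markov kernel: [ 2.928102] ata2.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 7 17:19:09 markov kernel: [ 2.944459] ata2.01: ATA-8: WDC WD2003FYYS-02W0B1, 01.01D02, max UDMA/133
Jan 7 17:19:09 markov kernel: [ 2.220056] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 7 17:19:09 markov kernel: [ 2.220103] ata1.01: SATA link down (SStatus 0 SControl 310)
Jan 7 17:19:09 markov kernel: [ 2.228670] ata1.00: ATA-8: ST3750330NS, SN05, max UDMA/133
"""

if __name__ == "__main__":
    for port, info in sorted(summarize_links(LOG).items()):
        print(port, info["link"], info["model"] or "(no drive detected)")
```

A link reported as down with no model line after it is consistent with an unpopulated port, but the full dmesg output is still needed to confirm that.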