Thread: 18 messages, 9 authors, 2008-07-08

Re: Software RAID1 deadlock in 2.6.25 kernels

From: Michael Bussmann <hidden>
Date: 2008-07-01 19:45:08

Hi,

On 2008-07-01 13:00:01 -0400, Mike McCarthy wrote:
Bill Davidsen wrote:
quoted
Mike McCarthy wrote:
quoted
Bill Davidsen wrote:
quoted
Wonder if hardware or software is happening, sounds like a
mishandled hardware error, but I'm guessing. I have a server with
RAID1 and Fedora 2.6.22.14-72.fc6PAE kernel, up 72 days, no problems.
I have a number of 2.6.25.7-9 machines with SW-RAID1 that are running
flawlessly so far.
quoted
Given heavy 2.6.25 use, my guess is still that the root cause of this
is hardware, and that the change in disk code either triggers the
hardware problem, or handles it differently. Are you by any chance
running NCQ on your system?
No.  This system and the drives pre-date NCQ.  I think NCQ is only  
Same here.  In my case the lockups are completely random and not related to
heavy disk I/O.  In fact, most lockups occur when the system is nearly idle.

So far I have tried:
	- replacing the IDE cables
	- kernel upgrades up to 2.6.25.9
	- removing one drive from the RAID, thus running in degraded mode
	- disabling CPU frequency scaling
	- putting one drive on the PDC20276 and the other on the ICH4 (82801DB)

Maybe my hardware _is_ broken, but I'll try some other settings anyway
(including using RTC again for ztdummy instead of HPET, disabling NOHZ etc).

Cheers,
MB

-- 
Michael Bussmann [off-list ref]