Thread: 2 messages, 2 authors, 2014-05-20

Re: System freeze triggered by RAID 10 resync and high io

From: NeilBrown <hidden>
Date: 2014-05-20 23:26:08

On Tue, 20 May 2014 23:24:36 +0200 christian.schwarz@posteo.de wrote:
Hello,

I have an issue with stalling writes that, within a few seconds, 
lead to a full system freeze. Because all writes fail once the 
condition has been triggered, nothing is written to the logs, and dmesg 
shows no errors either.

My setup is as follows:

md127 : active raid10 sda1[0] sdd1[3] sdc1[2] sdb1[1]
       3907023872 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

Linux 3.14.2-hardened-r1 #4 SMP PREEMPT

Fixed by upstream commit:

commit cc13b1d1500656a20e41960668f3392dda9fa6e2
Author: NeilBrown [off-list ref]
Date:   Mon May 5 13:34:37 2014 +1000

    md/raid10: call wait_barrier() for each request submitted.


which will be in 3.15, and hopefully will appear in the next 3.14.y release
(it isn't in 3.14.4).

NeilBrown

Dell Poweredge T20 Server


To trigger this condition, a resync of the array has to be running 
while heavy disk I/O is performed at the same time. The condition is 
triggered regardless of the resync speed (tested with 1, 10 and 100 
MB/s). I also tried different I/O schedulers and a non-preempt kernel.
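
For anyone who wants to confirm that a resync is actually in progress before starting a heavy I/O test, here is a minimal sketch. It assumes the md driver's sysfs attribute /sys/block/<array>/md/sync_action is available and reads "resync" while a resync runs; the helper name resync_running and the sysfs_root parameter are made up for illustration and are not part of any tool mentioned in this thread.

```python
from pathlib import Path


def resync_running(array="md127", sysfs_root="/sys/block"):
    """Return True if the given md array currently reports a resync.

    Assumption: <sysfs_root>/<array>/md/sync_action holds the current
    sync state as a single word ("idle", "resync", "recover", ...).
    """
    action_file = Path(sysfs_root) / array / "md" / "sync_action"
    try:
        state = action_file.read_text().strip()
    except OSError:
        # Array not present or sysfs not readable: treat as "no resync".
        return False
    return state == "resync"


if __name__ == "__main__":
    print("resync running on md127:", resync_running())
```

The sysfs_root parameter exists only so the check can be pointed at a scratch directory for testing; on a real system the default should be what you want.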


Someone else over at Fedora reported a similar problem: 
http://www.spinics.net/linux/fedora/fedora-kernel/msg05163.html


How can I help with providing additional information so you can locate 
the problem?


Thanks,

Christian

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
