Re: Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel
From: Neil Brown <hidden>
Date: 2010-09-21 22:21:54
On Mon, 20 Sep 2010 20:59:29 +0100 Tim Small [off-list ref] wrote:
Unfortunately I need more than just the set of blocked tasks to diagnose the problem. If you could get the result of 'echo t > /proc/sysrq-trigger' that might help a lot. This might be bigger than the dmesg buffer, so you might try booting with 'log_buf_len=1M' just to be sure.

Hi Neil,

Thanks for the feedback. I've stuck the sysrq-t output here:
http://buttersideup.com/files/md-raid1-lockup-lvm-snapshot/iodeadlock-sysrq-t.txt
Unfortunately this log is not complete. As I suggested, you need to boot with a larger log_buf_len (you seem to have 128K) to be able to capture the whole thing.

NeilBrown
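
Before repeating the sysrq-t capture, it may be worth confirming that the kernel was actually booted with the larger buffer. The following is only a rough sketch, not something from this thread: the function names parse_log_buf_len and current_log_buf_len are made up for illustration, and it handles only decimal values with an optional K, M or G suffix. If the option is absent, the kernel falls back to its compiled-in default size.

```python
import re

# Multipliers for the K/M/G size suffixes accepted on the kernel command line.
_SUFFIXES = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_log_buf_len(cmdline):
    """Return the log_buf_len value in bytes, or None if the option is absent.

    Hypothetical helper: only decimal numbers with an optional suffix are handled.
    """
    match = re.search(r"(?:^|\s)log_buf_len=(\d+)([KMGkmg]?)(?=\s|$)", cmdline)
    if match is None:
        return None
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix.upper()]


def current_log_buf_len(path="/proc/cmdline"):
    """Read the running kernel's command line and parse log_buf_len from it."""
    with open(path) as handle:
        return parse_log_buf_len(handle.read())
```

If current_log_buf_len() returns None after the reboot, the boot loader entry probably did not pick up the option, and the sysrq-t output is likely to be truncated again.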
... this was soon after the io to md2 stopped - md0 seems fine...
oldshoreham:~# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda6[0] sdb6[1]
404600128 blocks [2/2] [UU]
[>....................] resync = 0.1% (437056/404600128)
finish=343321.2min speed=19K/sec
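
As a sanity check on the figures above, the resync counters can be pulled out of the mdstat text and the implied rate recomputed. This is just an illustrative sketch: resync_progress is a hypothetical helper name, the regular expressions are written against the output pasted here rather than every mdstat variant, and it assumes the block counts are in 1K units.

```python
import re


def resync_progress(mdstat_text):
    """Extract resync counters from /proc/mdstat-style text.

    Returns None when no resync line is present. Hypothetical helper,
    matched against the output quoted in this thread.
    """
    counters = re.search(r"resync\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\)", mdstat_text)
    if counters is None:
        return None
    done, total = (int(group) for group in counters.groups())
    info = {
        "done": done,
        "total": total,
        "remaining": total - done,
        "percent": 100.0 * done / total,
    }
    finish = re.search(r"finish=([\d.]+)min", mdstat_text)
    if finish is not None:
        minutes = float(finish.group(1))
        # Remaining 1K blocks divided by the estimated seconds left.
        info["implied_k_per_sec"] = info["remaining"] / (minutes * 60.0)
    return info


sample = (
    "[>....................] resync = 0.1% (437056/404600128)\n"
    "finish=343321.2min speed=19K/sec\n"
)
progress = resync_progress(sample)
```

For the sample above the implied rate works out to between 19 and 20 K/sec, which agrees with the reported speed=19K/sec and shows how badly the resync on md2 had stalled.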
... I also tried an older Debian 5.0.x kernel from Mar 2009, which is a
less-patched 2.6.26, and got the same results. 2.6.32 hasn't deadlocked
after 10 minutes (2.6.26 usually does within a minute of boot-up), so
I'll leave it re-syncing overnight...
Cheers!
Tim.