Thread: 7 messages, 2 authors, 2013-05-07

Re: --grow RAID6 gives: md: md_do_sync() got signal ... exiting + hang

From: NeilBrown <hidden>
Date: 2013-05-07 12:40:51

On Tue, 7 May 2013 14:08:14 +0200 Ole Tange [off-list ref] wrote:
> On Tue, May 7, 2013 at 1:54 PM, NeilBrown [off-list ref] wrote:
> > On Tue, 7 May 2013 13:36:56 +0200 Ole Tange [off-list ref] wrote:
> > > I am expanding my 9 harddisk RAID6 to 10 harddisk RAID6:
> :
> > > It is, however, hanging the system.
> :
> > > # Do the reshape
> > > mdadm -v --grow /dev/md1 --raid-devices=10 --backup-file=/root/back-md1
> > > mdadm: Need to backup 7168K of critical section..
>
> This completed - did not hang.
>
> > What does
> >   grep . /sys/block/md1/md/*
> > show? Or does it hang?
>
> Hangs (ctrl-c works).
>
> > What about "mdadm --examine /dev/sd*"
>
> https://gist.github.com/anonymous/5532063
>
> The disk box contains more drives than just the array in question. The
> interesting array is: 242d6530:e2562ecb:1dcd2a97:15a1a868
>
> > Did the "mdadm --grow" appear to complete, and return to the shell prompt?
>
> Yes.
>
> > What kernel version?  What mdadm version?
>
> $ mdadm --version
> mdadm - v3.2.5 - 18th May 2012
>
> $ uname -r
> 3.2.0-0.bpo.1-amd64
>
> > A hanging /proc/mdstat is definitely not a good sign.  The "got signal ...
> > exiting" isn't good either.  I would expect more messages with that.
> > You didn't just "grep md" in dmesg did you?  That is a complete dmesg output
> > for the entire time period that could possibly be relevant?
>
> dmesg of controller upgrade (after which everything worked fine)
> followed by --grow at 4328065.432267
>
> https://gist.github.com/anonymous/5532093
>
> /Ole

Thanks for the extra info.  I can't find any smoking gun unfortunately.

What does "ps axgu" show?  I'm particularly looking for processes in 'D'
state.
If there are any, particularly if they are md related, try
  cat /proc/$PID/stack
for appropriate values of $PID
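
The check described above could be scripted roughly as follows. This is only a sketch, not something posted in the thread: the function names filter_d_state and dump_d_stacks are invented for illustration, and it assumes a procps-style ps that accepts "-eo pid=,stat=,comm=". Reading /proc/PID/stack normally requires root.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the thread).

# Read "PID STAT COMMAND" lines on stdin and print the PIDs whose
# state field starts with D (uninterruptible sleep).
filter_d_state() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Print the kernel stack of every process currently in D state.
# Assumes a procps-style ps; /proc/PID/stack usually requires root.
dump_d_stacks() {
    for pid in $(ps -eo pid=,stat=,comm= | filter_d_state); do
        echo "=== PID $pid ($(cat "/proc/$pid/comm" 2>/dev/null)) ==="
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Running dump_d_stacks as root prints one section per blocked process, which is easier to paste into a reply than checking each PID by hand.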

Maybe also try
   echo t > /proc/sysrq-trigger

and see what gets into 'dmesg' - hopefully your dmesg buffer is big enough to
hold the important stack traces.
If you get anything from either of those, please post.
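
A possible way to capture the task dump in one step is sketched below. The helper name capture_task_dump and the default output path /tmp/sysrq-t.txt are assumptions made for this example. Note that the kernel file is /proc/sysrq-trigger (with a hyphen), and writing to it requires root.

```shell
#!/bin/sh
# Hypothetical helper (name and default output path are assumptions).
# Ask the kernel to log the state and stack of every task, then save
# the kernel log so the traces can be posted. Needs root.
capture_task_dump() {
    trigger=${1:-/proc/sysrq-trigger}   # overridable for a dry run
    out=${2:-/tmp/sysrq-t.txt}
    echo t > "$trigger" || return 1
    dmesg > "$out" 2>/dev/null || echo "could not read kernel log" >&2
}
```

Calling capture_task_dump with no arguments uses the real trigger file. If the start of the dump is missing from the saved file, the kernel log buffer was probably too small to hold all the traces.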

NeilBrown
