Thread: 16 messages, 4 authors, 2012-11-16

Re: [3.6.6] panic on reboot / khungtaskd blocked? (WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule)

From: Michael Wang <hidden>
Date: 2012-11-14 03:09:14
Also in: lkml

On 11/14/2012 10:49 AM, Robert Hancock wrote:
On 11/13/2012 08:32 PM, Michael Wang wrote:
On 11/13/2012 05:40 PM, Paweł Sikora wrote:
On Monday 12 of November 2012 13:33:39 Paweł Sikora wrote:
On Monday 12 of November 2012 11:22:47 Paweł Sikora wrote:
On Monday 12 of November 2012 15:40:31 Michael Wang wrote:
On 11/12/2012 03:16 PM, Paweł Sikora wrote:
On Monday 12 of November 2012 11:04:12 Michael Wang wrote:
On 11/09/2012 09:48 PM, Paweł Sikora wrote:
Hi,

while playing with a new UPS i've caught a nice oops on reboot:

http://imgbin.org/index.php?page=image&id=10253

probably the upstream is also affected.
Hi, Paweł

Are you using a clean 3.6.6 without any modifications?
yes, pure 3.6.6 from git tree with modular config.
Looks like some thread has set itself to UNINTERRUPTIBLE without any
plan to switch itself back later (or the wait is too long); are you
accidentally using some badly designed module?
hmm, hard to say. mostly all modules are loaded automatically by the kernel.
Could you please provide the whole dmesg as text? Your picture lost the
printed info about the hung task.
i've grabbed the console via rs232 but there's no more info (see attached txt).
hmm, i have one observation.

during rc.shutdown there are messages on the console like this:
Cannot stat file /proc/$pid/fd/1: Connection timed out
afaics this file descriptor points to a vnc log file on a remote machine, e.g.:

# ps aux|grep xfwm4
eda       1748  0.0  0.0 320220 11224 ?        S    13:08   0:00 xfwm4

# readlink -m /proc/1748/fd/1
/remote/dragon/ahome/eda/.vnc/odra:11.log

# mount|grep ahome
dragon:/home/users/ on /remote/dragon/ahome type nfs (rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.0.2.121,mountvers=3,mountport=45251,mountproto=udp,local_lock=none,addr=10.0.2.121)



so, probably during `killall5 -TERM/-KILL` at the shutdown stage
something sometimes goes wrong and these processes (xfce4/vncserver)
survive the signal and hang on the nfs i/o.
ok, now i have full sysrq+w backtraces from the shutdown process. i hope
it'll help you.
This can only tell us which tasks are in UNINTERRUPTIBLE state, but
without time info, we can't find out which one is the hung task...
Probably all of the ones in D state waiting on NFS are the issue - but
as I understand it, with modern kernels processes are supposed to be
killable while waiting on NFS I/O. Maybe there's a bug that affects
this, though?
That sounds possible. I think Paweł can try to stop using NFS (if
possible) and take a look; if the issue disappears, then it's time to
report the bug to the NFS folks.

Regards,
Michael Wang
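
For anyone who wants to check which tasks sit in D state before the
reboot, here is a minimal sketch, not something posted in this thread.
It walks /proc and prints the PID, name and wait channel of every task
whose State: field is D. The function name list_d_state and its optional
proc-root argument are made up for illustration; /proc/PID/wchan may be
empty or unreadable depending on kernel config and permissions, in which
case "?" is printed.

```shell
#!/bin/sh
# Hypothetical helper: list tasks in uninterruptible sleep (state D).
# $1 optionally overrides the proc root (default /proc); the override
# exists only to make the function easy to try against a fake tree.
list_d_state() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # Tasks may exit while we iterate; skip anything unreadable.
        [ -r "$status" ] || continue
        state=$(awk '/^State:/ {print $2; exit}' "$status" 2>/dev/null)
        if [ "$state" = "D" ]; then
            pid=${status%/status}
            pid=${pid##*/}
            name=$(sed -n 's/^Name:[[:space:]]*//p' "$status" 2>/dev/null)
            wchan=$(cat "$proc_root/$pid/wchan" 2>/dev/null)
            printf '%s\t%s\t%s\n' "$pid" "$name" "${wchan:-?}"
        fi
    done
}

list_d_state
```

A single run carries no time info either, so it has the same limit as
the sysrq+w dump. Running it twice some seconds apart and comparing the
PIDs gives a rough idea of which tasks stay blocked; the threshold that
khungtaskd itself uses is in /proc/sys/kernel/hung_task_timeout_secs.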
  