Thread (12 messages) 12 messages, 4 authors, 2017-07-25

Re: [linux-next] cpus stalls detected few hours after booting next kernel

From: Paul McKenney <hidden>
Date: 2017-06-29 15:47:35
Also in: linux-next

[ Adding my IMAP address on CC to avoid another Lotus Notes reply to LKML.
Man, it has been more than 15 years since the last one!!! ]

Nicholas Piggin [off-list ref] wrote on 06/29/2017 07:45:34 AM:
On Thu, 29 Jun 2017 20:23:05 +1000
Nicholas Piggin [off-list ref] wrote:
quoted
On Thu, 29 Jun 2017 19:36:14 +1000
Nicholas Piggin [off-list ref] wrote:
quoted
quoted
I don't *think* the replay-wakeup-interrupt patch is directly
involved, but
quoted
quoted
it's likely to be one of the idle patches.
Okay this turned out to be misconfigured sleep states I added for the
simulator, sorry for the false alarm.
quoted
Although you have this in the backtrace. I wonder if that's a stuck
lock in rcu_process_callbacks?
So this spinlock becomes top of the list of suspects. Can you try
enabling lockdep and try to reproduce it?
Hopefully lockdep would give you the answer.

But if that doesn't help, the rcu_invoke_callback event trace would
identify the function that the stuck callback invoked.

							Thanx, Paul
quoted
[ 5948.345539] CPU: 63 PID: 7360 Comm: rs:main Q:Reg Not tainted \
4.12.0-rc7-next-20170628 #2 [ 5948.345612] task: c000000f1c14e600
task.stack: \
quoted
c000000f1c1e8000 [ 5948.345672] NIP: c0000000009c7a10 LR:
c0000000009c7a08 CTR: \
quoted
c00000000015eda0 [ 5948.358553] REGS: c000000f1c1eb150 TRAP: 0501
Not tainted  \
quoted
(4.12.0-rc7-next-20170628) [ 5948.358623] MSR: 9000000000009033 \
<SF,HV,EE,ME,IR,DR,RI,LE> [ 5948.358626]   CR: 28082884  XER: 20000000
[ 5948.358706] CFAR: c0000000009c7a28 SOFTE: 1
GPR00: c00000000015f14c c000000f1c1eb3d0 c000000001062b00
0000000000000001
quoted
GPR04: c000000fff6e6180 0000000000000000 0000000000000001
00000000000000cc
quoted
GPR08: 0000000000000001 000000008000004a 0000000000000000
0000000000000000
quoted
GPR12: c00000000015eda0 c00000000fd55a80
[ 5948.358986] NIP [c0000000009c7a10] _raw_spin_lock_irqsave+0x90/0x100
[ 5948.359043] LR [c0000000009c7a08] _raw_spin_lock_irqsave+0x88/0x100
[ 5948.359099] Call Trace:
[ 5948.359123] [c000000f1c1eb3d0] [0000000000000001] 0x1 (unreliable)
[ 5948.359182] [c000000f1c1eb410] [c00000000015f14c] \
d> rcu_process_callbacks+0x3ac/0x620 [ 5948.359252] [c000000f1c1eb4c0] \
quoted
[c0000000000e1e0c] __do_softirq+0x14c/0x3a0 [ 5948.365958]
[c000000f1c1eb5b0] \
quoted
[c0000000000e2448] irq_exit+0x108/0x120 [ 5948.366016]
[c000000f1c1eb5d0] \
quoted hunk ↗ jump to hunk
quoted
[c0000000000232b4] timer_interrupt+0xa4/0xe0 [ 5948.366075]
[c000000f1c1eb600] \
quoted
[c000000000009208] decrementer_common+0x158/0x160 [ 5948.366149]
--- interrupt: 901 \
quoted
at start_this_handle+0xd0/0x4b0  LR = jbd2__journal_start+0x17c/0x2b0
[ 5948.366242] [c000000f1c1eb8f0] [c000000f20a5cb00]
0xc000000f20a5cb00 (unreliable)
quoted
[ 5948.366314] [c000000f1c1eba00] [c00000000040717c]
jbd2__journal_start+0x17c/0x2b0
quoted
[ 5948.366388] [c000000f1c1eba70] [c00000000038edf4] \
__ext4_journal_start_sb+0x84/0x180 [ 5948.366459] [c000000f1c1ebad0] \
[c0000000003b47dc] ext4_da_write_begin+0x17c/0x520 [ 5948.366532]
[c000000f1c1ebb90] \
quoted
[c00000000021f9c8] generic_perform_write+0xe8/0x250 [ 5948.366604]
[c000000f1c1ebc20] \
quoted
[c000000000220d20] __generic_file_write_iter+0x200/0x240 [ 5948.366677]
\
quoted
[c000000f1c1ebc80] [c00000000039d614] ext4_file_write_iter+0x2e4/
0x4d0 [ 5948.373255] \
quoted
[c000000f1c1ebd00] [c0000000002e13c0] __vfs_write+0x120/0x200 [
5948.373313] \
quoted
[c000000f1c1ebd90] [c0000000002e2c48] vfs_write+0xc8/0x240
[ 5948.373371] \
quoted
[c000000f1c1ebde0] [c0000000002e4940] SyS_write+0x60/0x110
[ 5948.373430] \
quoted
[c000000f1c1ebe30] [c00000000000b8e0] system_call+0x38/0xdc
[ 5948.373486] \
quoted
Instruction dump: [ 5948.373521] 7fe3fb78 e8010010 eba1ffe8
ebc1fff0 ebe1fff8 \
quoted
7c0803a6 4e800020 8bad028a  [ 5948.373592] 7fe3fb78 4b64db15
60000000 7c210b78 \
quoted
<e92d0000> 89290009 792affe3 40820048  [ 5948.374515] Sending NMI
from CPU 74 to CPUs \
quoted
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help