Re: [linux-next] cpus stalls detected few hours after booting next kernel
From: Abdul Haleem <hidden>
Date: 2017-07-07 11:06:25
Also in: linux-next, lkml
On Fri, 2017-06-30 at 17:28 +1000, Nicholas Piggin wrote:
On Fri, 30 Jun 2017 10:52:18 +0530 Abdul Haleem [off-list ref] wrote:
On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
On Thu, 29 Jun 2017 20:23:05 +1000 Nicholas Piggin [off-list ref] wrote:
On Thu, 29 Jun 2017 19:36:14 +1000 Nicholas Piggin [off-list ref] wrote:
I don't *think* the replay-wakeup-interrupt patch is directly involved, but it's likely to be one of the idle patches.
Okay this turned out to be misconfigured sleep states I added for the simulator, sorry for the false alarm.
Although you have this in the backtrace. I wonder if that's a stuck lock in rcu_process_callbacks?
So this spinlock becomes top of the list of suspects. Can you try enabling lockdep and try to reproduce it?
Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set. I do not see any difference in trace messages with and without LOCKDEP enabled. Please find the attached log file.
Can you get an rcu_invoke_callback event trace that Paul suggested?
Yes, I was able to collect the perf data for the rcu_invoke_callback event on a recent next kernel (4.12.0-next-20170705). The issue is rare to hit. After booting the next kernel, I started the command 'perf record -e rcu:rcu_invoke_callback -a -g -- cat' and waited for 30 minutes. Five minutes after seeing the stall messages, I pressed CTRL-C to end the perf command.
@Nicholas: the perf.data file is too large to attach here. Shall I send you the internal location of the file over Slack or mail? Also, the machine is still in the same state if you want to use it.
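Since the perf.data file is too large to attach, one option might be to share a per-callback summary of the recorded events instead. The following is only a minimal sketch, not something that was run on this machine. It assumes the text dump produced by 'perf script -i perf.data' carries a "func=<symbol>" field on each rcu:rcu_invoke_callback line; the script name summarize_rcu.py and the dump file events.txt are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: every rcu_invoke_callback event line in the text dump has a
# "func=<symbol>" field naming the callback. Adjust the pattern if the
# tracepoint format on your kernel differs.
FUNC_FIELD = re.compile(r"rcu_invoke_callback:.*?\bfunc=(\S+)")


def count_callbacks(lines):
    """Count rcu_invoke_callback events per callback function name."""
    counts = Counter()
    for line in lines:
        match = FUNC_FIELD.search(line)
        if match:  # callchain lines and other events are skipped
            counts[match.group(1)] += 1
    return counts


def main(path):
    """Print the 20 most frequently invoked callbacks found in the dump."""
    with open(path, errors="replace") as dump:
        for func, hits in count_callbacks(dump).most_common(20):
            print(f"{hits:8d}  {func}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage:
    #   perf script -i perf.data > events.txt
    #   python3 summarize_rcu.py events.txt
    main(sys.argv[1])
```

A summary like this only shows which callbacks were invoked and how often. It drops the call chains collected with -g, so the full perf.data would still be needed for a closer look.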
Does this bug show up with just the powerpc next branch?
Thanks,
Nick
--
Regards,
Abdul Haleem
IBM Linux Technology Centre