Thread (25 messages) 25 messages, 7 authors, 2016-07-30

Re: [Query] Preemption (hogging) of the work handler

From: Jan Kara <jack@suse.cz>
Date: 2016-07-14 14:12:25
Also in: lkml

Possibly related (same subject, not in this thread)

On Wed 13-07-16 14:45:07, Sergey Senozhatsky wrote:
Cc Petr Mladek.

On (07/12/16 16:19), Viresh Kumar wrote:
[..]
quoted
Okay, we have tracked this BUG and its really interesting.
good find!
quoted
I hacked the platform's serial driver to implement a putchar() routine
that simply writes to the FIFO in polling mode, that helped us in
tracing on where we are going wrong.

The problem is that we are running asynchronous printks and we call
wake_up_process() from the last running CPU which has disabled
interrupts. That takes us to: try_to_wake_up().

In our case the CPU gets deadlocked on this line in try_to_wake_up().

        raw_spin_lock_irqsave(&p->pi_lock, flags);
yeah, printk() can't handle these types of recursion. it can prevent
printk() calls issued from within the logbuf_lock spinlock section,
with some limitations:

	if (unlikely(logbuf_cpu == smp_processor_id())) {
		recursion_bug = true;
		return;
	}

	raw_spin_lock(&logbuf_lock);
	logbuf_cpu = this_cpu;
		...
	logbuf_cpu = UINT_MAX;
	raw_spin_unlock(&logbuf_lock);

so should, for instance, raw_spin_unlock() generate spin_dump(), printk()
will blow up (both sync and async), because logbuf_cpu is already reset.
it may look that async printk added another source of recursion - wake_up().
but, apparently, this is not exactly correct. because there is already a
wake_up() call in console_unlock() - up().

	printk()
	 if (logbuf_cpu == smp_processor_id())
		return;

         raw_spin_lock(&logbuf_lock);
	 logbuf_cpu = this_cpu;
	 ...
	 logbuf_cpu = UINT_MAX;
         raw_spin_unlock(&logbuf_lock);

	 console_trylock()
	   raw_spin_lock_irqsave(&sem->lock)      << ***
	   raw_spin_unlock_irqsave(&sem->lock)    << ***

	 console_unlock()
          up()
	   raw_spin_lock_irqsave(&sem->lock)  << ***
	    __up()
	     wake_up_process()
	      try_to_wake_up()  << *** in may places


*** a printk() call from here will kill the system. either it will
recurse printk(), or spin forever in 'nested' printk() on one of
the already taken spin locks.
Exactly. Calling printk() from certain parts of the kernel (like scheduler
code or timer code) has been always unsafe because printk itself uses these
parts and so it can lead to deadlocks. That's why printk_deffered() has
been introduced as you mention below.

And with sync printk the above deadlock doesn't trigger only by chance - if
there happened to be a waiter on console_sem while we suspend, the same
deadlock would trigger because up(&console_sem) will try to wake him up and
the warning in timekeeping code will cause recursive printk.

So I think your patch doesn't really address the real issue - it only
works around the particular WARN_ON(timekeeping_enabled) warning but if
there was a different warning in timekeeping code which would trigger, it
has a potential for causing recursive printk deadlock (and indeed we had
such issues previously - see e.g. 504d58745c9c "timer: Fix lock inversion
between hrtimer_bases.lock and scheduler locks").

So there are IMHO two issues here worth looking at:

1) I didn't find how a wakeup would would lead to calling to ktime_get() in
the current upstream kernel or even current RT kernel. Maybe this is a
problem specific to the 3.10 kernel you are using? If yes, we don't have to
do anything for current upstream AFAIU.

If I just missed how wakeup can call into ktime_get() in current upstream,
there is another question:

2) Is it OK that printk calls wakeup so late during suspend? I believe it
is but I'm neither scheduler nor suspend expert. If it is OK, and wakeup
can lead to ktime_get() in current upstream, then this contradicts the
check WARN_ON(timekeeping_suspended) in ktime_get() and something is wrong.

Adding Thomas to CC as timer / RT expert...

								Honza
quoted hunk ↗ jump to hunk
so... I think we can switch to sync printk mode in suspend_console() and
enable async printk from resume_console(). IOW, suspend/kexec are now
executed under sync printk mode.

we already call console_unlock() during suspend, which is synchronous,
many times (e.g. console_cpu_notify()).


something like below, perhaps. will this work for you?

---
 kernel/printk/printk.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index bbb4180..786690e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -288,6 +288,11 @@ static u32 log_buf_len = __LOG_BUF_LEN;
 
 /* Control whether printing to console must be synchronous. */
 static bool __read_mostly printk_sync = true;
+/*
+ * Force sync printk mode during suspend/kexec, regardless whether
+ * console_suspend_enabled permits console suspend.
+ */
+static bool __read_mostly force_printk_sync;
 /* Printing kthread for async printk */
 static struct task_struct *printk_kthread;
 /* When `true' printing thread has messages to print */
@@ -295,7 +300,7 @@ static bool printk_kthread_need_flush_console;
 
 static inline bool can_printk_async(void)
 {
-	return !printk_sync && printk_kthread;
+	return !printk_sync && printk_kthread && !force_printk_sync;
 }
 
 /* Return log buffer address */
@@ -2027,6 +2032,7 @@ static bool suppress_message_printing(int level) { return false; }
 
 /* Still needs to be defined for users */
 DEFINE_PER_CPU(printk_func_t, printk_func);
+static bool __read_mostly force_printk_sync;
 
 #endif /* CONFIG_PRINTK */
 
@@ -2163,6 +2169,8 @@ MODULE_PARM_DESC(console_suspend, "suspend console during suspend"
  */
 void suspend_console(void)
 {
+	force_printk_sync = true;
+
 	if (!console_suspend_enabled)
 		return;
 	printk("Suspending console(s) (use no_console_suspend to debug)\n");
@@ -2173,6 +2181,8 @@ void suspend_console(void)
 
 void resume_console(void)
 {
+	force_printk_sync = false;
+
 	if (!console_suspend_enabled)
 		return;
 	down_console_sem();
-- 
2.9.0.rc1
-- 
Jan Kara [off-list ref]
SUSE Labs, CR
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help