Thread: 11 messages, 4 authors, 2017-11-14

Re: [PATCH v2 1/3] powerpc/powernv: Always stop secondaries before reboot/shutdown

From: Nicholas Piggin <npiggin@gmail.com>
Date: 2017-11-11 06:00:16

On Fri, 10 Nov 2017 22:08:32 +1100
Michael Ellerman wrote:

> Nicholas Piggin writes:
>
> > Currently powernv reboot and shutdown requests just leave secondaries
> > to do their own things. This is undesirable because they can trigger
> > any number of watchdogs while waiting for reboot, but also we don't
> > know what else they might be doing, or they might be stuck somewhere
> > causing trouble.
> >
> > The opal scheduled flash update code already ran into watchdog problems
> > due to flashing taking a long time, but it's possible for regular
> > reboots to trigger problems too (this is with watchdog_thresh set to 1,
> > but I have seen it with watchdog_thresh at the default value once too):
> >
> >   reboot: Restarting system
> >   [  360.038896709,5] OPAL: Reboot request...
> >   Watchdog CPU:0 Hard LOCKUP
> >   Watchdog CPU:44 detected Hard LOCKUP other CPUS:16
> >   Watchdog CPU:16 Hard LOCKUP
> >   watchdog: BUG: soft lockup - CPU#16 stuck for 3s! [swapper/16:0]
> >
> > So remove the special case for flash update, and unconditionally do
> > smp_send_stop before rebooting.
> >
> > Return the CPUs to Linux stop loops rather than OPAL. The reason for
> > this is that the path to firmware is longer, and the CPUs may have
> > been interrupted from firmware, which may cause problems to re-enter
> > it. It's better to put them into a simple spin loop to maximize the
> > chance of a successful reboot.
>
> I always assumed we had to send the CPUs back to OPAL for the flashing
> procedure. Is it OK to leave them in Linux?

According to the comment and changelog

2196c6f1ed66eef23df3b478cfe71661ae83726e

It was added just to keep secondaries from going silly. Vasant, can
you remember details?

Thanks,
Nick