Thread (24 messages) 24 messages, 2 authors, 2023-06-14

Re: [PATCH 2/7] watchdog/hardlockup: Make the config checks more straightforward

From: Doug Anderson <dianders@chromium.org>
Date: 2023-06-07 23:35:36
Also in: linux-perf-users, lkml, sparclinux

Hi,

On Wed, Jun 7, 2023 at 8:25 AM Petr Mladek [off-list ref] wrote:
quoted hunk ↗ jump to hunk
diff --git a/arch/Kconfig b/arch/Kconfig
index 422f0ffa269e..13c6e596cf9e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -404,17 +404,27 @@ config HAVE_NMI_WATCHDOG
        depends on HAVE_NMI
        bool
        help
-         The arch provides a low level NMI watchdog. It provides
-         asm/nmi.h, and defines its own watchdog_hardlockup_probe() and
-         arch_touch_nmi_watchdog().
+         The arch provides its own hardlockup detector implementation instead
+         of the generic perf one.
nit: did you mean to have different wording here compared to
HAVE_HARDLOCKUP_DETECTOR_ARCH? Here you say "the generic perf one" and
there you say "the generic ones", though it seems like you mean them
to be the same.
quoted hunk ↗ jump to hunk
+
+         Sparc64 defines this variable without HAVE_HARDLOCKUP_DETECTOR_ARCH.
+         It does _not_ use the command line parameters and sysctl interface
+         used by generic hardlockup detectors. Instead it is enabled/disabled
+         by the top-level watchdog interface that is common for both softlockup
+         and hardlockup detectors.

 config HAVE_HARDLOCKUP_DETECTOR_ARCH
        bool
        select HAVE_NMI_WATCHDOG
        help
-         The arch chooses to provide its own hardlockup detector, which is
-         a superset of the HAVE_NMI_WATCHDOG. It also conforms to config
-         interfaces and parameters provided by hardlockup detector subsystem.
+         The arch provides its own hardlockup detector implementation instead
+         of the generic ones.
+
+         It uses the same command line parameters, and sysctl interface,
+         as the generic hardlockup detectors.
+
+         HAVE_NMI_WATCHDOG is selected to build the code shared with
+         the sparc64 specific implementation.

 config HAVE_PERF_REGS
        bool
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3e91fa33c7a0..d201f5d3876b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1035,16 +1035,33 @@ config BOOTPARAM_SOFTLOCKUP_PANIC

          Say N if unsure.

+config HAVE_HARDLOCKUP_DETECTOR_BUDDY
+       bool
+       depends on SMP
+       default y
I think you simplify your life if you also add:

  depends on !HAVE_NMI_WATCHDOG

The existing config system has always assumed that no architecture
defines both HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG
(symbols would have clashed and you would get a link error as two
watchdogs try to implement the same weak symbol). If you add the extra
dependency to "buddy" as per above, then a few things below fall out.
I'll try to point them out below.

+
 #
-# arch/ can define HAVE_HARDLOCKUP_DETECTOR_ARCH to provide their own hard
-# lockup detector rather than the perf based detector.
+# Global switch whether to build a hardlockup detector at all. It is available
+# only when the architecture supports at least one implementation. There are
+# two exceptions. The hardlockup detector is newer enabled on:
s/newer/never/

+#
+#      s390: it reported many false positives there
+#
+#      sparc64: has a custom implementation which is not using the common
+#              hardlockup command line options and sysctl interface.
+#
+# Note that HAVE_NMI_WATCHDOG is used to distinguish the sparc64 specific
+# implementaion. It is automatically enabled also for other arch-specific
+# variants which set HAVE_HARDLOCKUP_DETECTOR_ARCH. It makes the check
+# of avaialable and supported variants quite tricky.
 #
 config HARDLOCKUP_DETECTOR
        bool "Detect Hard Lockups"
        depends on DEBUG_KERNEL && !S390
-       depends on HAVE_HARDLOCKUP_DETECTOR_NON_ARCH || HAVE_HARDLOCKUP_DETECTOR_ARCH
+       depends on ((HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_BUDDY) && !HAVE_NMI_WATCHDOG) || HAVE_HARDLOCKUP_DETECTOR_ARCH
Adding the dependency to buddy (see ablove) would simplify the above
to just this:

depends on HAVE_HARDLOCKUP_DETECTOR_PERF ||
HAVE_HARDLOCKUP_DETECTOR_BUDDY || HAVE_HARDLOCKUP_DETECTOR_ARCH

As per above, it's simply a responsibility of architectures not to
define that they have both "perf" if they have the NMI watchdog, so
it's just buddy to worry about.

quoted hunk ↗ jump to hunk
+       imply HARDLOCKUP_DETECTOR_PERF
+       imply HARDLOCKUP_DETECTOR_BUDDY
        select LOCKUP_DETECTOR
-       select HARDLOCKUP_DETECTOR_NON_ARCH if HAVE_HARDLOCKUP_DETECTOR_NON_ARCH

        help
          Say Y here to enable the kernel to act as a watchdog to detect
@@ -1055,9 +1072,15 @@ config HARDLOCKUP_DETECTOR
          chance to run.  The current stack trace is displayed upon detection
          and the system will stay locked up.

+#
+# Note that arch-specific variants are always preferred.
+#
 config HARDLOCKUP_DETECTOR_PREFER_BUDDY
        bool "Prefer the buddy CPU hardlockup detector"
-       depends on HAVE_HARDLOCKUP_DETECTOR_PERF && SMP
+       depends on HARDLOCKUP_DETECTOR
+       depends on HAVE_HARDLOCKUP_DETECTOR_PERF && HAVE_HARDLOCKUP_DETECTOR_BUDDY
+       depends on !HAVE_NMI_WATCHDOG
Can get rid of above "!HAVE_NMI_WATCHDOG" if it's added to
HAVE_HARDLOCKUP_DETECTOR_BUDDY.

+       default n
I'm pretty sure "default n" isn't needed.

quoted hunk ↗ jump to hunk
        help
          Say Y here to prefer the buddy hardlockup detector over the perf one.
@@ -1071,39 +1094,27 @@ config HARDLOCKUP_DETECTOR_PREFER_BUDDY

 config HARDLOCKUP_DETECTOR_PERF
        bool
-       depends on HAVE_HARDLOCKUP_DETECTOR_PERF
+       depends on HARDLOCKUP_DETECTOR
+       depends on HAVE_HARDLOCKUP_DETECTOR_PERF && !HARDLOCKUP_DETECTOR_PREFER_BUDDY
+       depends on !HAVE_NMI_WATCHDOG
We don't really need the "!HAVE_NMI_WATCHDOG". A user wouldn't be able
to mess this up and it's up to an architecture not to define both
HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG.

        select HARDLOCKUP_DETECTOR_COUNTS_HRTIMER

 config HARDLOCKUP_DETECTOR_BUDDY
        bool
-       depends on SMP
+       depends on HARDLOCKUP_DETECTOR
+       depends on HAVE_HARDLOCKUP_DETECTOR_BUDDY
+       depends on !HAVE_HARDLOCKUP_DETECTOR_PERF || HARDLOCKUP_DETECTOR_PREFER_BUDDY
+       depends on !HAVE_NMI_WATCHDOG
Get rid of "!HAVE_NMI_WATCHDOG" and it should be in
HAVE_HARDLOCKUP_DETECTOR_BUDDY


Overall, though, I agree that this improves things! Thanks! :-)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help