Re: [PATCH 2/7] watchdog/hardlockup: Make the config checks more straightforward
From: Doug Anderson <dianders@chromium.org>
Date: 2023-06-07 23:35:36
Also in:
linux-perf-users, lkml, sparclinux
Hi, On Wed, Jun 7, 2023 at 8:25 AM Petr Mladek [off-list ref] wrote:
quoted hunk ↗ jump to hunk
diff --git a/arch/Kconfig b/arch/Kconfig index 422f0ffa269e..13c6e596cf9e 100644 --- a/arch/Kconfig +++ b/arch/Kconfig@@ -404,17 +404,27 @@ config HAVE_NMI_WATCHDOG depends on HAVE_NMI bool help - The arch provides a low level NMI watchdog. It provides - asm/nmi.h, and defines its own watchdog_hardlockup_probe() and - arch_touch_nmi_watchdog(). + The arch provides its own hardlockup detector implementation instead + of the generic perf one.
nit: did you mean to have different wording here compared to HAVE_HARDLOCKUP_DETECTOR_ARCH? Here you say "the generic perf one" and there you say "the generic ones", though it seems like you mean them to be the same.
quoted hunk ↗ jump to hunk
+ + Sparc64 defines this variable without HAVE_HARDLOCKUP_DETECTOR_ARCH. + It does _not_ use the command line parameters and sysctl interface + used by generic hardlockup detectors. Instead it is enabled/disabled + by the top-level watchdog interface that is common for both softlockup + and hardlockup detectors. config HAVE_HARDLOCKUP_DETECTOR_ARCH bool select HAVE_NMI_WATCHDOG help - The arch chooses to provide its own hardlockup detector, which is - a superset of the HAVE_NMI_WATCHDOG. It also conforms to config - interfaces and parameters provided by hardlockup detector subsystem. + The arch provides its own hardlockup detector implementation instead + of the generic ones. + + It uses the same command line parameters, and sysctl interface, + as the generic hardlockup detectors. + + HAVE_NMI_WATCHDOG is selected to build the code shared with + the sparc64 specific implementation. config HAVE_PERF_REGS booldiff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 3e91fa33c7a0..d201f5d3876b 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug@@ -1035,16 +1035,33 @@ config BOOTPARAM_SOFTLOCKUP_PANIC Say N if unsure. +config HAVE_HARDLOCKUP_DETECTOR_BUDDY + bool + depends on SMP + default y
I think you simplify your life if you also add: depends on !HAVE_NMI_WATCHDOG The existing config system has always assumed that no architecture defines both HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG (symbols would have clashed and you would get a link error as two watchdogs try to implement the same weak symbol). If you add the extra dependency to "buddy" as per above, then a few things below fall out. I'll try to point them out below.
+ # -# arch/ can define HAVE_HARDLOCKUP_DETECTOR_ARCH to provide their own hard -# lockup detector rather than the perf based detector. +# Global switch whether to build a hardlockup detector at all. It is available +# only when the architecture supports at least one implementation. There are +# two exceptions. The hardlockup detector is newer enabled on:
s/newer/never/
+#
+# s390: it reported many false positives there
+#
+# sparc64: has a custom implementation which is not using the common
+# hardlockup command line options and sysctl interface.
+#
+# Note that HAVE_NMI_WATCHDOG is used to distinguish the sparc64 specific
+# implementaion. It is automatically enabled also for other arch-specific
+# variants which set HAVE_HARDLOCKUP_DETECTOR_ARCH. It makes the check
+# of avaialable and supported variants quite tricky.
#
config HARDLOCKUP_DETECTOR
bool "Detect Hard Lockups"
depends on DEBUG_KERNEL && !S390
- depends on HAVE_HARDLOCKUP_DETECTOR_NON_ARCH || HAVE_HARDLOCKUP_DETECTOR_ARCH
+ depends on ((HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_BUDDY) && !HAVE_NMI_WATCHDOG) || HAVE_HARDLOCKUP_DETECTOR_ARCHAdding the dependency to buddy (see ablove) would simplify the above to just this: depends on HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_BUDDY || HAVE_HARDLOCKUP_DETECTOR_ARCH As per above, it's simply a responsibility of architectures not to define that they have both "perf" if they have the NMI watchdog, so it's just buddy to worry about.
quoted hunk ↗ jump to hunk
+ imply HARDLOCKUP_DETECTOR_PERF + imply HARDLOCKUP_DETECTOR_BUDDY select LOCKUP_DETECTOR - select HARDLOCKUP_DETECTOR_NON_ARCH if HAVE_HARDLOCKUP_DETECTOR_NON_ARCH help Say Y here to enable the kernel to act as a watchdog to detect@@ -1055,9 +1072,15 @@ config HARDLOCKUP_DETECTOR chance to run. The current stack trace is displayed upon detection and the system will stay locked up. +# +# Note that arch-specific variants are always preferred. +# config HARDLOCKUP_DETECTOR_PREFER_BUDDY bool "Prefer the buddy CPU hardlockup detector" - depends on HAVE_HARDLOCKUP_DETECTOR_PERF && SMP + depends on HARDLOCKUP_DETECTOR + depends on HAVE_HARDLOCKUP_DETECTOR_PERF && HAVE_HARDLOCKUP_DETECTOR_BUDDY + depends on !HAVE_NMI_WATCHDOG
Can get rid of above "!HAVE_NMI_WATCHDOG" if it's added to HAVE_HARDLOCKUP_DETECTOR_BUDDY.
+ default n
I'm pretty sure "default n" isn't needed.
quoted hunk ↗ jump to hunk
help Say Y here to prefer the buddy hardlockup detector over the perf one.@@ -1071,39 +1094,27 @@ config HARDLOCKUP_DETECTOR_PREFER_BUDDY config HARDLOCKUP_DETECTOR_PERF bool - depends on HAVE_HARDLOCKUP_DETECTOR_PERF + depends on HARDLOCKUP_DETECTOR + depends on HAVE_HARDLOCKUP_DETECTOR_PERF && !HARDLOCKUP_DETECTOR_PREFER_BUDDY + depends on !HAVE_NMI_WATCHDOG
We don't really need the "!HAVE_NMI_WATCHDOG". A user wouldn't be able to mess this up and it's up to an architecture not to define both HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG.
select HARDLOCKUP_DETECTOR_COUNTS_HRTIMER
config HARDLOCKUP_DETECTOR_BUDDY
bool
- depends on SMP
+ depends on HARDLOCKUP_DETECTOR
+ depends on HAVE_HARDLOCKUP_DETECTOR_BUDDY
+ depends on !HAVE_HARDLOCKUP_DETECTOR_PERF || HARDLOCKUP_DETECTOR_PREFER_BUDDY
+ depends on !HAVE_NMI_WATCHDOGGet rid of "!HAVE_NMI_WATCHDOG" and it should be in HAVE_HARDLOCKUP_DETECTOR_BUDDY Overall, though, I agree that this improves things! Thanks! :-)