Re: [PATCH v4 0/2] Add stop_on_panic support for watchdog
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
Date: 2025-03-05 12:24:47
Also in:
chrome-platform, imx, linux-arm-kernel, linux-mips, linux-watchdog, lkml
Hello George,

On 05.03.25 13:15, George Cherian wrote:
>> On 05.03.25 12:28, George Cherian wrote:
>>>> that can't be disabled and would protect against system lock up:
>>>> Consider a memory-corruption bug (perhaps externally via DMA), which
>>>> partially overwrites both main and kdump kernel. With a disabled
>>>> watchdog, the system may not be able to recover on its own.
>>>
>>> Yes, that is the reason why the kernel command-line is optional and
>>> by default it is set to zero. So that in cases if you have a corrupted
>>> kdump kernel then watchdog kicks in.
>>
>> The existing option isn't enough for the kdump kernel use case. If we
>> (i.e. you) are going to do something about it, wouldn't it be better
>> to have a solution that's applicable to a wider number of watchdog
>> devices?
>
> I need a slight clarification here.
> 1. reset_on_panic takes the number of seconds to be reloaded to watchdog
>    HW, so that it initiates a watchdog reset after the specified timeout,
>    if the kdump kernel fails to boot or hangs while booting.

Yes.

> 2. in case reset_on_panic = 0 then it behaves like stop_on_panic=1.
>    Is this what you meant?

Alternatively, reset_on_panic = 0 could also mean stopping the watchdog
as you do now. I haven't thought through yet what would make the most sense.

> I would let Guenter comment on this approach.

+1.
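
To make the semantics under discussion a bit more concrete, here is a rough userspace model, not kernel code. The struct, enum, and function names are made up for illustration, and reset_on_panic is only the parameter proposed in this thread. Treating 0 as "stop if the hardware allows it" is just one of the two readings mentioned above, not a settled decision:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a watchdog; NOT the kernel's struct watchdog_device. */
struct wdt_model {
    bool can_stop;        /* hardware supports being disabled */
    bool running;         /* watchdog is currently counting down */
    unsigned int timeout; /* seconds until reset while running */
};

enum panic_action {
    PANIC_WDT_REARMED,   /* timeout reloaded, reset still guaranteed */
    PANIC_WDT_STOPPED,   /* watchdog disabled, as stop_on_panic=1 does */
    PANIC_WDT_UNCHANGED, /* cannot stop: keep the current timeout */
};

/*
 * Assumed semantics (one possible reading of the discussion):
 *   reset_on_panic > 0  -> reload the watchdog with that many seconds, so a
 *                          kdump kernel that fails to boot still gets reset
 *   reset_on_panic == 0 -> stop the watchdog if the hardware can be stopped,
 *                          otherwise leave it running untouched
 */
static enum panic_action wdt_model_on_panic(struct wdt_model *w,
                                            unsigned int reset_on_panic)
{
    assert(w != NULL);

    if (reset_on_panic > 0) {
        w->timeout = reset_on_panic;
        w->running = true;
        return PANIC_WDT_REARMED;
    }

    if (w->can_stop) {
        w->running = false;
        return PANIC_WDT_STOPPED;
    }

    return PANIC_WDT_UNCHANGED;
}
```

The point of the non-zero case is that it also works for watchdogs that can't be disabled at all: they never need a working stop operation, only a reload.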

>> If you are serious with the watchdog use, you'll want to use the
>> watchdog to monitor kernel startup as well. If the bootloader can set
>> a watchdog timeout just before starting the kernel and it doesn't
>> expire before the kernel watchdog driver takes over, why can't we do
>> the same just before starting the dumpkernel?
>
> Yes, in an ideal world with ideal HW. But there are HW with issues which
> cannot have large enough Watchdog time. Such HW would boot from FW to
> kernel without watchdog enabled. And stop_on_panic does the similar for
> kdump kernel too.

Yes, but there are likely more kinds of watchdog devices that cannot be
disabled, so it makes sense to have a solution that is more broadly
applicable from the get-go.

Cheers,
Ahmad

> -George
>
>> Thanks,
>> Ahmad
>>
>>>> Thanks,
>>>> Ahmad
>>>>> Changelog:
>>>>> v1 -> v2
>>>>> - Remove the per driver flag setting option
>>>>> - Take the parameter via kernel command-line parameter to
>>>>>   watchdog_core.
>>>>>
>>>>> v2 -> v3
>>>>> - Remove the helper function watchdog_stop_on_panic() from
>>>>>   watchdog.h.
>>>>> - There are no users for this.
>>>>>
>>>>> v3 -> v4
>>>>> - Since the panic notifier is in atomic context, watchdog functions
>>>>>   which sleep can't be called.
>>>>> - Add an options flag WDIOF_STOP_MAYSLEEP to indicate whether stop
>>>>>   function sleeps.
>>>>> - Simplify the stop_on_panic kernel command line parsing.
>>>>> - Enable the panic notifier only if the watchdog stop function
>>>>>   doesn't sleep
>>>>>
>>>>> George Cherian (2):
>>>>>   watchdog: Add a new flag WDIOF_STOP_MAYSLEEP
>>>>>   drivers: watchdog: Add support for panic notifier callback
>>>>>
>>>>> - George
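
For reference, the v4 gating described in the changelog above (only hook the panic notifier when the stop function can't sleep) boils down to roughly the following check. This is a standalone sketch, not the patch itself: the function name is hypothetical and the flag value is a placeholder, since the real value of WDIOF_STOP_MAYSLEEP is whatever the patch defines:

```c
#include <stdbool.h>

/* Placeholder bit for illustration only; the real WDIOF_STOP_MAYSLEEP
 * value comes from the patch and is not reproduced here. */
#define MODEL_WDIOF_STOP_MAYSLEEP 0x1u

/*
 * The panic notifier runs in atomic context, so a stop function that may
 * sleep must never be called from it. Register the notifier only when the
 * user asked for stop_on_panic AND the driver did not flag its stop
 * function as possibly sleeping.
 */
static bool model_should_register_panic_notifier(bool stop_on_panic,
                                                 unsigned int options)
{
    if (!stop_on_panic)
        return false;

    return (options & MODEL_WDIOF_STOP_MAYSLEEP) == 0;
}
```

Drivers whose stop function may sleep fall through to "no notifier", which is exactly the set of devices a reload-based approach could still cover.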

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |