Thread: 21 messages, 7 authors, 2025-03-06

Re: [PATCH v4 0/2] Add stop_on_panic support for watchdog

From: Ahmad Fatoum <a.fatoum@pengutronix.de>
Date: 2025-03-05 12:24:47
Also in: chrome-platform, imx, linux-arm-kernel, linux-mips, linux-watchdog, lkml

Hello George,

On 05.03.25 13:15, George Cherian wrote:
On 05.03.25 12:28, George Cherian wrote:
that can't be disabled and would protect against system lock up: 
Consider a memory-corruption bug (perhaps externally via DMA), which partially
overwrites both main and kdump kernel. With a disabled watchdog, the system
may not be able to recover on its own.
Yes, that is why the kernel command-line parameter is optional and defaults to zero,
so that the watchdog still kicks in if the kdump kernel is corrupted.
The existing option isn't enough for the kdump kernel use case.
If we (i.e. you) are going to do something about it, wouldn't it be
better to have a solution that's applicable to a wider number of
watchdog devices?
I need a slight clarification here.
1. reset_on_panic takes the number of seconds to be reloaded into the watchdog HW, so that it triggers a
watchdog reset after the specified timeout if the kdump kernel fails to boot or hangs while booting.
Yes.
2. If reset_on_panic = 0, it behaves like stop_on_panic=1.
Is this what you meant?
Alternatively, reset_on_panic = 0 could also mean stopping the watchdog as
you do now. I haven't thought through yet what would make the most sense.
I would let Guenter comment on this approach.
+1.
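
To make the reset_on_panic semantics discussed above concrete, here is a
minimal userspace sketch in C. It is not kernel code and not part of the
patch series: struct wdt_model and wdt_model_on_panic() are hypothetical
names made up for illustration, and treating 0 as "stop the watchdog" is
only one of the readings under discussion, not a settled design.

```c
#include <stdbool.h>

/* Hypothetical model of a watchdog; NOT the kernel's struct watchdog_device. */
struct wdt_model {
	bool running;           /* is the hardware counting down? */
	unsigned int timeout_s; /* currently programmed timeout in seconds */
};

/*
 * One possible reading of the proposed reset_on_panic parameter
 * (an assumption, not confirmed by the thread):
 *   0 -> stop the watchdog, i.e. the same effect as stop_on_panic=1
 *   N -> reload the hardware with N seconds and leave it running, so a
 *        kdump kernel that fails to boot or hangs still gets reset
 */
static void wdt_model_on_panic(struct wdt_model *wdt,
			       unsigned int reset_on_panic)
{
	if (reset_on_panic == 0) {
		wdt->running = false;
		return;
	}

	wdt->timeout_s = reset_on_panic;
	wdt->running = true;
}
```

With this reading, hardware that cannot be stopped could still honor a
non-zero value, which is the broader applicability argued for in this thread.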
If you are serious with the watchdog use, you'll want to use the watchdog to
monitor kernel startup as well. If the bootloader can set a watchdog timeout
just before starting the kernel and it doesn't expire before the kernel watchdog
driver takes over, why can't we do the same just before starting the dumpkernel?
Yes, in an ideal world with ideal HW. But some HW has issues and cannot support a large
enough watchdog timeout. Such HW would boot from FW to kernel without the watchdog enabled,
and stop_on_panic does something similar for the kdump kernel.
Yes, but there are likely more kinds of watchdog devices that cannot be disabled,
so it makes sense to have a solution that is more broadly applicable from the get-go.

Cheers,
Ahmad
-George
Thanks,
Ahmad

Thanks,
Ahmad
Changelog:
v1 -> v2
- Remove the per driver flag setting option
- Take the parameter via kernel command-line parameter to watchdog_core.

v2 -> v3
- Remove the helper function watchdog_stop_on_panic() from watchdog.h.
- There are no users for this. 

v3 -> v4
- Since the panic notifier is in atomic context, watchdog functions
  which sleep can't be called. 
- Add an options flag WDIOF_STOP_MAYSLEEP to indicate whether stop
  function sleeps.
- Simplify the stop_on_panic kernel command line parsing.
- Enable the panic notifier only if the watchdog stop function doesn't
  sleep
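
The v4 rule above (register the panic notifier only when the stop function
cannot sleep) can be sketched as a small userspace model in C. This is an
illustration under stated assumptions, not the code from the patches:
struct wdt_info_model and wdt_model_wants_panic_notifier() are hypothetical
names, and MODEL_WDIOF_STOP_MAYSLEEP uses a placeholder bit value rather
than whatever value the series assigns to WDIOF_STOP_MAYSLEEP.

```c
#include <stdbool.h>

/* Placeholder bit for illustration only; the real flag value comes from the patch. */
#define MODEL_WDIOF_STOP_MAYSLEEP 0x1u

/* Hypothetical stand-in for the driver's option flags. */
struct wdt_info_model {
	unsigned int options;
};

/*
 * Decide whether a panic notifier should be registered for this device.
 * The panic notifier runs in atomic context, so a stop routine that may
 * sleep must never be called from it.
 */
static bool wdt_model_wants_panic_notifier(const struct wdt_info_model *info,
					   bool stop_on_panic)
{
	if (!stop_on_panic)
		return false; /* feature not requested on the command line */

	return !(info->options & MODEL_WDIOF_STOP_MAYSLEEP);
}
```

In this model a driver whose stop routine may sleep simply does not get the
notifier, so stop_on_panic has no effect for that device.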

George Cherian (2):
  watchdog: Add a new flag WDIOF_STOP_MAYSLEEP
  drivers: watchdog: Add support for panic notifier callback
- George

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |