Thread (3 messages), 3 authors, 2015-07-14

Re: [PATCH 1/3] panic: Disable crash_kexec_post_notifiers if kdump is not available

From: <hidden>
Date: 2015-07-14 17:30:13
Also in: kexec, linux-arm-kernel, linux-sh, linuxppc-dev, lkml

On Tue, Jul 14, 2015 at 12:06:15PM -0500, Eric W. Biederman wrote:
Vivek Goyal [off-list ref] writes:
quoted
On Tue, Jul 14, 2015 at 03:48:33PM +0000, dwalker@fifo99.com wrote:
quoted
On Tue, Jul 14, 2015 at 11:40:40AM -0400, Vivek Goyal wrote:
quoted
On Tue, Jul 14, 2015 at 03:34:30PM +0000, dwalker@fifo99.com wrote:
quoted
On Tue, Jul 14, 2015 at 11:02:08AM -0400, Vivek Goyal wrote:
quoted
On Tue, Jul 14, 2015 at 01:59:19PM +0000, dwalker@fifo99.com wrote:
quoted
On Mon, Jul 13, 2015 at 08:19:45PM -0500, Eric W. Biederman wrote:
quoted
dwalker@fifo99.com writes:
quoted
On Fri, Jul 10, 2015 at 08:41:28AM -0500, Eric W. Biederman wrote:
quoted
Hidehiro Kawai [off-list ref] writes:
quoted
You can call panic notifiers and kmsg dumpers before kdump by
specifying "crash_kexec_post_notifiers" as a boot parameter.
However, it doesn't make sense if kdump is not available.  In that
case, disable "crash_kexec_post_notifiers" boot parameter so that
you can't change the value of the parameter.
Nacked-by: "Eric W. Biederman" [off-list ref]
I think it would make sense if he just replaced "kdump" with "kexec".
It would be less insane, however it still makes no sense as without
kexec on panic support crash_kexec is a noop.  So the value of the
setting makes no difference.
Can you explain more, I don't really understand what you mean. Are you suggesting
the whole "crash_kexec_post_notifiers" feature has no value ?
Daniel,

BTW, why are you using crash_kexec_post_notifiers commandline? Why not
without it?
It was explained in the prior thread but to rehash, the notifiers are used to do a switch
over from the crashed machine to another redundant machine.
So why not detect failure using polling or issue notifications from second
kernel.

IOW, expecting that a crashed machine will be able to deliver notification
reliably is flawed to begin with, IMHO.
It's flawed to think you can kexec, but you still do it right ? I've not gotten into
the deep details of this switching process, but that's how this interface is used.
Sure. But the deal here is that users of interface know that sometimes it
can be unreliable. And in the absence of more reliable mechanism, somewhat
less reliable mechanism is fine. 
If a machine is failing, there is a high chance it can't deliver you the
notification. Detecting that failure using some kind of polling mechanism
might be more reliable. And it will make even kdump mechanism more
reliable so that it does not have to run panic notifiers after the crash.
I think what you're suggesting is that my company should change how its hardware works
and that's not really an option for me. This isn't a simple thing like checking over the
network if the machine is down or not, this is way more complex hardware design.
That means you are ready to live with an unreliable design. There might be
cases where notifier does not get run properly and you will not do switch
despite the fact that OS has failed. I was just trying to nudge you in
a direction which could be more reliable mechanism.
Sigh I see some deep confusion going on here.

The panic notifiers are just that panic notifiers.  They have not been
nor should they be tied to kexec.   If those notifiers force a switch
over between machines I fail to see why you would care if it was
kexec or another panic situation that is forcing that switchover.
Hidehiro isn't fixing the failover situation on my side, he's fixing register
information collection when crash_kexec_post_notifiers is used.

Daniel
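
For readers following the ordering argument in this thread, here is a rough
userspace sketch of how "crash_kexec_post_notifiers" changes the panic path,
and why the flag has no effect when crash_kexec is a noop. It is only a model
under stated assumptions, not kernel code: model_panic(), model_crash_kexec(),
run_notifiers() and the steps[] log are hypothetical names made up for
illustration, and the real panic() does considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_STEPS 8

/* Log of the steps taken, in order, so the sequence can be inspected. */
static const char *steps[MAX_STEPS];
static int nsteps;

static void record(const char *step)
{
	assert(nsteps < MAX_STEPS);
	steps[nsteps++] = step;
}

/*
 * Hypothetical stand-in for crash_kexec(). It does nothing unless kexec
 * support is built in and a crash kernel is loaded. Returns true when the
 * (modelled) jump to the crash kernel happens; the real call would not
 * return in that case.
 */
static bool model_crash_kexec(bool kexec_built_in, bool crash_kernel_loaded)
{
	if (!kexec_built_in || !crash_kernel_loaded)
		return false;
	record("crash_kexec");
	return true;
}

/* Hypothetical stand-in for the panic notifier chain and kmsg dumpers. */
static void run_notifiers(void)
{
	record("notifiers");
	record("kmsg_dump");
}

/* Simplified model of the panic path. Returns the number of steps logged. */
static int model_panic(bool post_notifiers, bool kexec_built_in,
		       bool crash_kernel_loaded)
{
	nsteps = 0;

	/* Default: try kdump first, before any notifier can run. */
	if (!post_notifiers &&
	    model_crash_kexec(kexec_built_in, crash_kernel_loaded))
		return nsteps;

	run_notifiers();

	/* With the boot parameter set: try kdump only after the notifiers. */
	if (post_notifiers)
		model_crash_kexec(kexec_built_in, crash_kernel_loaded);

	return nsteps;
}
```

In this model, calling model_panic() with kexec_built_in set to false logs the
same two steps whether post_notifiers is true or false, which is the "makes no
difference" case described above. With a crash kernel loaded, the flag decides
whether the notifiers run at all before the jump.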