Thread (15 messages, 3 authors, 2018-03-13)

[PATCH] arm64: kdump: fix interrupt handling done during machine_crash_shutdown

From: Grzegorz Jaszczyk <hidden>
Date: 2018-03-02 12:59:34
Also in: lkml

2018-03-02 13:05 GMT+01:00 Mark Rutland [off-list ref]:
On Fri, Mar 02, 2018 at 12:56:24PM +0100, Grzegorz Jaszczyk wrote:
Thank you for your feedback. I probably over-interpreted a paragraph of
the documentation to justify the (probably) buggy behavior that I am
seeing. Regardless of the correctness of this patch, I would appreciate
it if you could help me understand this issue.

First the whole story: I was debugging why the crashdump kernel hangs
at a very early stage when kdump is triggered from the
ARM_SBSA_WATCHDOG interrupt handler, while everything works fine when
it is triggered from process context. It finally turned out that the
crashdump kernel doesn't get any timer interrupts. I also noticed that
this problem doesn't occur when the GIC is configured to work in
EOImode == 1. In that case, the write to GIC_CPU_EOI in gic_handle_irq
causes a priority drop to idle, and therefore when the crashdump kernel
starts, the timer interrupt is able to preempt the still-active
watchdog interrupt (I know that this interrupt shouldn't be active
after irq_set_irqchip_state, but for some reason it doesn't seem to do
the job correctly).
Do you have a way to reproduce the problem?

Is there an easy way to cause the watchdog to trigger a kdump as above,
e.g. via LKDTM?
You can reproduce this problem by:
- enabling CONFIG_ARM_SBSA_WATCHDOG in your kernel
- passing on the kernel command line: sbsa_gwdt.action=1 sbsa_gwdt.timeout=170
- then loading/preparing the crashdump kernel (I am doing it via the kexec tool)
- echo 1 > /dev/watchdog

After 170 s the watchdog interrupt will fire, triggering a panic, and
the whole kexec machinery will run. The sbsa_gwdt.timeout can't be too
small, since the same value is also used for the reset stage:
|----timeout-----(panic)----timeout-----reset.
If it is too small, the crashdump kernel will not have enough time to start.
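
The timeline above can be sketched as a small calculation. This is only
an illustration of the constraint, assuming the same timeout applies to
both stages as described; the struct, the function names and the
"needed" boot time are hypothetical and are not part of the sbsa_gwdt
driver.

```c
/* Two-stage watchdog timeline, in seconds since the watchdog was armed.
 * Hypothetical helper types for illustration only. */
struct toy_wdt_timeline {
    unsigned panic_at;  /* first timeout: interrupt fires, panic/kdump */
    unsigned reset_at;  /* second timeout: hardware reset */
};

static struct toy_wdt_timeline toy_wdt_timeline(unsigned timeout)
{
    struct toy_wdt_timeline t = {
        .panic_at = timeout,
        .reset_at = 2 * timeout,
    };
    return t;
}

/* Returns 1 if a crashdump kernel that needs `needed` seconds to start
 * fits between the panic and the reset, 0 otherwise. */
static int toy_crash_kernel_fits(unsigned timeout, unsigned needed)
{
    struct toy_wdt_timeline t = toy_wdt_timeline(timeout);

    return needed <= t.reset_at - t.panic_at;
}
```

With timeout=170 the panic lands at 170 s and the reset at 340 s, so the
crashdump kernel gets a 170 s window; a timeout of a few seconds would
leave it almost none.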

It is also reproducible with other interrupts: as a test, I put a
panic in an I2C interrupt handler and it behaved the same.

To use the GIC in EOImode == 0, you can satisfy some of the
gic_check_eoimode (irqchip/irq-gic.c) conditions or, just for testing,
"return false;" in this function.

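The difference between the two EOI modes described earlier can be
illustrated with a toy model of the CPU interface. It is a sketch, not
the irq-gic.c code: every toy_* name is hypothetical, it assumes that
the watchdog and timer interrupts share one priority (0xa0 here), and
it assumes that clearing the active state during crash shutdown does
not change the running priority.

```c
#include <stdbool.h>

#define TOY_PRIO_IDLE 0xff  /* nothing is being handled */
#define TOY_PRIO_DEF  0xa0  /* assumed: all IRQs share this priority */

/* Toy CPU interface state; hypothetical, not a kernel structure. */
struct toy_gic {
    int eoimode;            /* 0: EOI drops priority and deactivates,
                             * 1: EOI only drops priority */
    unsigned running_prio;  /* lower value means higher priority */
    bool wdt_active;        /* active state of the watchdog IRQ */
};

/* Acknowledge: the IRQ becomes active and the running priority rises. */
static void toy_ack(struct toy_gic *g)
{
    g->wdt_active = true;
    g->running_prio = TOY_PRIO_DEF;
}

/* EOI write: always a priority drop, plus deactivation in EOImode 0. */
static void toy_eoi(struct toy_gic *g)
{
    g->running_prio = TOY_PRIO_IDLE;
    if (g->eoimode == 0)
        g->wdt_active = false;
}

/* Crash shutdown clears the active state; in this model (assumption)
 * the running priority is left untouched. */
static void toy_clear_active(struct toy_gic *g)
{
    g->wdt_active = false;
}

/* A pending IRQ is signaled only if it beats the running priority. */
static bool toy_can_signal(const struct toy_gic *g, unsigned prio)
{
    return prio < g->running_prio;
}

/* Panic inside the watchdog handler, then crash shutdown. Returns true
 * if a timer IRQ could still be signaled to the crashdump kernel. */
static bool toy_timer_works_after_panic(int eoimode)
{
    struct toy_gic g = {
        .eoimode = eoimode,
        .running_prio = TOY_PRIO_IDLE,
        .wdt_active = false,
    };

    toy_ack(&g);
    if (eoimode == 1)
        toy_eoi(&g);        /* EOI is written before the handler runs */
    /* The handler panics here; in EOImode 0 the EOI write that would
     * follow the handler never happens. */
    toy_clear_active(&g);

    return toy_can_signal(&g, TOY_PRIO_DEF);
}
```

In this model toy_timer_works_after_panic(1) is true and
toy_timer_works_after_panic(0) is false, which matches the observed
behavior: in EOImode == 0 the running priority stays raised, so a timer
interrupt of equal priority is never delivered.
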
I think you just mean GICv2 here. GICv2m is an MSI controller, and
shouldn't interact with the SBSA watchdog's SPI.
Yes, of course. I just wanted to mention that it has an MSI controller.

Thank you,
Grzegorz