[PATCH] generic: Add the exception case checking routine for ppi interrupt
From: Marc Zyngier <hidden>
Date: 2016-08-31 08:37:25
Also in:
lkml
On 31/08/16 07:35, majun (F) wrote:
Hi Marc & Mark: ? 2016/8/30 19:21, Mark Rutland ??:quoted
On Tue, Aug 30, 2016 at 12:07:36PM +0100, Marc Zyngier wrote:quoted
+Mark On 30/08/16 11:35, majun (F) wrote:quoted
? 2016/8/30 16:50, Marc Zyngier ??:quoted
On 30/08/16 05:17, MaJun wrote:quoted
From: Ma Jun <redacted> During system booting, if the interrupt which has no action registered is triggered, it would cause system panic when try to access the action member.And why would that interrupt be enabled? If you enable a PPI before registering a handler, you're doing something wrong.Actually,the problem described above happened during the capture kernel booting. In my system, sometimes there is a pending physical timer interrupt(30) when the first kernel panic and the status is kept until the capture kernel booting.And that's perfectly fine. The interrupt can be pending forever, as it shouldn't get enabled.quoted
So, this interrupt will be handled during capture kernel booting.Why? Who enables it?quoted
Becasue we use virt timer interrupt but not physical timer interrupt in capture kernel, the interrupt 30 has no action handler.Again: who enables this interrupt? Whichever driver enables it should be fixed.I'm also at a loss. In this case, arch_timer_uses_ppi must be VIRT_PPI. So in arch_timer_register(), we'll only request_percpu_irq the virt PPI. arch_timer_has_nonsecure_ppi() will be false, given arch_timer_uses_ppi is VIRT_PPI, so in arch_timer_starting_cpu() we'll only enable_percpu_irq() the virt PPI. We don't fiddle with arch_timer_uses_ppi after calling arch_timer_register(). So I can't see how we could enable another IRQ in this case. Looking at the driver in virt/kvm/arm/arch_timer.c, we only enable what we've succesfully requested, so it doesnt' seem like there's an issue there.quoted
From a quick look at teh GIC driver, it looks like we reset PPIscorrectly, so it doesn't look like we have a "latent enable".I just checked the status of irq 30 during capture kernel booting. The irq 30 status is: mask, pending after arch_timer_starting_cpu() called. Because irq 30 triggered only 1 time during capture kernel booting, I think this problem maybe happened in the case like: 1:irq 30 triggered, but not acked by cpu yet. 2:local_irq_disable() called 3:system reboot -->capture kernel booting 4:local_irq_enable() 5:irq 30 acked by CPU. Is this case possible?
I can't see how, because you've missed: 3b: All PPIs are disabled as each CPU comes up So for (5) to occur, I can only see two possibilities: (a) either something else is enabling the timer PPI (b) your GIC doesn't correctly retire a pending PPI that is being disabled I'm discounting (b) because I can't see how the system would work otherwise, so (a) must be happening somehow. Thanks, M. -- Jazz is not dead. It just smells funny...