Thread: 29 messages, 4 authors, 2016-05-23

Crash after 'reboot' due to 9be4fd2c7723a

From: rafael@kernel.org (Rafael J. Wysocki)
Date: 2016-05-23 20:23:52
Also in: linux-pm

On Mon, May 23, 2016 at 10:16 PM, Rafael J. Wysocki [off-list ref] wrote:
> On Mon, May 23, 2016 at 8:28 PM, Russell King - ARM Linux
> [off-list ref] wrote:
>> On Sat, May 21, 2016 at 02:56:20AM +0200, Rafael J. Wysocki wrote:
>>> The root of the problem seems to be arch_irq_work_raise() and specifically
>>> the __smp_cross_call function that appears to have problems.
>>
>> We've been through this before.  The bottom line is that on ARM, there
>> is major wide-spread understanding that we want to be able to run a
>> kernel compiled for SMP on uniprocessor hardware.
>>
>> This is something that's been going for years, and has worked fine for
>> years.
>>
>> Now, someone has introduced this irq work stuff.  Great.  But they
>> haven't considered that people want to be able to run a SMP kernel
>> on UP hardware which _may_ have no hardware present which is capable
>> of raising IPIs.
>>
>> I don't know what the situation is with the platform concerned here,
>> I don't know whether it uses the GIC (presumably, because it doesn't
>> NULL-pointer ref on calling smp_cross_call(), it is using the GIC.)
>> If so, the GIC isn't delivering the IPI because the IPI hardware is
>> missing.
>>
>> Now, whether we could detect whether the GIC is IPI capable, I've
>> no idea, I don't have such a platform I could run tests on.  People
>> don't give me hardware anymore now that arm-soc is split off from
>> core ARM maintenance - it all goes into Linaro build farms.  So I'm
>> powerless to help here.
>>
>> My attitude towards this is that it's a core kernel problem: the
>> core kernel is assuming that it can raise IPIs even on non-SMP
>> capable hardware.  Much non-SMP hardware doesn't have that ability.
>> While folk can try to push it into arch code, my feeling is that it
>> really needs to have a generic solution, even if it's some generic
>> solution that architectures in this situation can plug into.
>
> Well, this particular case isn't about IPIs at all.
>
> irq_work_queue() can work without IPIs and it works like that on other
> ARM platforms where SMP kernels are run on UP hardware (without IPI
> support).
>
> Something seems to be missing in kernel config or similar here, but I
> can't really say what it is right away.

For one, if that platform is not capable of raising interrupts for IRQ
works, I'm not sure why arch_irq_work_has_interrupt() returns true on
it.
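
To make the point about arch_irq_work_has_interrupt() concrete, here is a
minimal userspace model in C. It is a sketch under stated assumptions, not
kernel code: struct cpu_model, struct work_item and every model_*() function
are hypothetical names invented for illustration. The assumption being
modelled is that the generic code falls back to running queued work from the
timer tick only when the architecture reports that it has no IRQ work
interrupt. If the architecture reports "true" but the interrupt is never
delivered, the work simply stays pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct work_item {
    void (*func)(struct work_item *w);
    struct work_item *next;
    bool pending;   /* already on the list, do not queue twice */
    int runs;       /* how many times func has been called */
};

struct cpu_model {
    bool has_irq_work_interrupt;  /* stands in for arch_irq_work_has_interrupt() */
    struct work_item *raised;     /* list of queued work */
    int self_interrupts_raised;   /* stands in for arch_irq_work_raise() calls */
};

/* Example callback: only counts invocations. */
void count_run(struct work_item *w)
{
    w->runs++;
}

/* Queue work; request a self-interrupt only if the CPU claims to have one. */
bool model_queue(struct cpu_model *cpu, struct work_item *w)
{
    bool was_empty = (cpu->raised == NULL);

    assert(w->func != NULL);
    if (w->pending)
        return false;

    w->pending = true;
    w->next = cpu->raised;
    cpu->raised = w;

    if (was_empty && cpu->has_irq_work_interrupt)
        cpu->self_interrupts_raised++;
    return true;
}

/* Run and empty the list; returns the number of items executed. */
int model_run_list(struct cpu_model *cpu)
{
    struct work_item *w = cpu->raised;
    int n = 0;

    cpu->raised = NULL;
    while (w != NULL) {
        struct work_item *next = w->next;

        w->next = NULL;
        w->pending = false;
        w->func(w);
        n++;
        w = next;
    }
    return n;
}

/* Timer tick: the fallback path, used only without an IRQ work interrupt. */
int model_tick(struct cpu_model *cpu)
{
    if (cpu->has_irq_work_interrupt)
        return 0;   /* work is left for the interrupt handler */
    return model_run_list(cpu);
}

/* Interrupt handler path: runs only if the hardware delivers the interrupt. */
int model_interrupt(struct cpu_model *cpu)
{
    return model_run_list(cpu);
}
```

With has_irq_work_interrupt set to false, a queued item runs on the next
model_tick() and no interrupt is requested. With it set to true, model_tick()
does nothing and the item runs only if model_interrupt() is ever called, which
is the situation in question if the hardware cannot deliver that interrupt.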