Thread: 16 messages, 4 authors, 2014-12-30

Re: am335x: cpsw: interrupt failure

From: Yegor Yefremov <hidden>
Date: 2014-12-10 20:58:44

On Wed, Dec 10, 2014 at 6:17 PM, Felipe Balbi [off-list ref] wrote:
Hi,

On Fri, Dec 05, 2014 at 11:03:44AM +0100, Yegor Yefremov wrote:
On Thu, Dec 4, 2014 at 5:56 PM, Felipe Balbi [off-list ref] wrote:
Hi,

On Thu, Dec 04, 2014 at 05:41:38PM +0100, Yegor Yefremov wrote:
I have the following problem: my system reboots under high network load
after this commit (found via git bisect):

commit 55601c9f24670ba926ebdd4d712ac3b177232330
Author: Felipe Balbi [off-list ref]
Date:   Mon Sep 8 17:54:58 2014 -0700

    arm: omap: intc: switch over to linear irq domain

    now that we don't need to support legacy board-files,
    we can completely switch over to a linear irq domain
    and make use of irq_alloc_domain_generic_chips() to
    allocate all generic irq chips for us.

    Signed-off-by: Felipe Balbi [off-list ref]
    Signed-off-by: Tony Lindgren [off-list ref]

and I get the following error messages:

irq 0, desc: cf004000, depth: 1, count: 0, unhandled: 0
irq 0 ? Weird, that's not a valid IRQ.
->handle_irq():  c0087fc0, handle_bad_irq+0x0/0x258
->irq_data.chip(): c08e7174, no_irq_chip+0x0/0x68
->action():   (null)
   IRQ_NOPROBE set
 IRQ_NOREQUEST set
irq 0, desc: cf004000, depth: 1, count: 0, unhandled: 0
->handle_irq():  c0087fc0, handle_bad_irq+0x0/0x258
->irq_data.chip(): c08e7174, no_irq_chip+0x0/0x68
->action():   (null)
   IRQ_NOPROBE set
 IRQ_NOREQUEST set
irq 0, desc: cf004000, depth: 1, count: 0, unhandled: 0
->handle_irq():  c0087fc0, handle_bad_irq+0x0/0x258
->irq_data.chip(): c08e7174, no_irq_chip+0x0/0x68
->action():   (null)
   IRQ_NOPROBE set
 IRQ_NOREQUEST set
irq 0, desc: cf004000, depth: 1, count: 0, unhandled: 0
->handle_irq():  c0087fc0, handle_bad_irq+0x0/0x258
->irq_data.chip(): c08e7174, no_irq_chip+0x0/0x68
->action():   (null)
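
As an aside on why "irq 0" is flagged as invalid above: with a linear irq
domain, each hardware interrupt number is looked up in a reverse-map table to
get its Linux (virtual) IRQ number, and the kernel uses 0 to mean "no
mapping". The toy model below is only a sketch of that lookup idea, not the
kernel's code. The class, its method names, and the allocator starting at 1
are all hypothetical, and it does not claim to show the actual cause of the
failure reported in this thread.

```python
# Toy model of a linear IRQ domain lookup. Illustration only: the names
# below are hypothetical and this is NOT the kernel's implementation.

NO_IRQ = 0  # virtual IRQ 0 stands for "no mapping"


class LinearIrqDomain:
    def __init__(self, size):
        # One reverse-map slot per hardware IRQ line, all unmapped at first.
        self.linear_revmap = [NO_IRQ] * size
        self._next_virq = 1  # assumption: virtual IRQ numbers start at 1

    def create_mapping(self, hwirq):
        """Map a hardware IRQ to a virtual IRQ, reusing an existing mapping."""
        if self.linear_revmap[hwirq] == NO_IRQ:
            self.linear_revmap[hwirq] = self._next_virq
            self._next_virq += 1
        return self.linear_revmap[hwirq]

    def find_mapping(self, hwirq):
        """Return the virtual IRQ for hwirq, or NO_IRQ if it was never mapped."""
        if 0 <= hwirq < len(self.linear_revmap):
            return self.linear_revmap[hwirq]
        return NO_IRQ


if __name__ == "__main__":
    domain = LinearIrqDomain(size=8)
    domain.create_mapping(3)
    print(domain.find_mapping(3))  # a mapped line yields a non-zero virq
    print(domain.find_mapping(5))  # an unmapped line yields NO_IRQ (0)
```

In this model, dispatching an interrupt whose hardware number has no mapping
would hand IRQ 0 to the generic layer, which has no handler or chip for it.
That is consistent with the handle_bad_irq / no_irq_chip lines in the report,
but whether it is what happens on this board is not established here.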

My system: am335x with Fast Ethernet on the first CPSW slave and Gigabit
Ethernet on the second CPSW slave. The issue occurs when I run nuttcp
with default settings.

With the commit above I can at least see these messages, but 3.18-rc7,
for example, reboots without any messages.

Any idea?
If you take v3.18-rc7 and just revert that commit, does the problem go
away?
git revert failed, either because the driver has changed further in the
meantime or because I'm missing some parameters. I've tried to force the
driver to use the legacy routines, but then I don't get past U-Boot's
"Starting kernel ..." message. See attached patch.

Compiler used:

Linux version 3.18.0-rc7 (...) (gcc version 4.8.3 20140320
(prerelease) (Sourcery CodeBench Lite 2014.05-29) ) #309 SMP Fri Dec 5
10:59:38 CET 2014

Btw, what am335x-based hardware do you have? I can run tests on both
BBB and am335x-evmsk.
Coming back to this: I have a BBB only. Can you provide some extra
information on how I can trigger this problem here?
I have two am335x-based boards on which I can trigger this problem
via nuttcp (I think iperf would do the job too). The first system
stalls almost immediately; the second one ran for about 7 minutes.
I have tried the same kernel on am335x-evmsk, and that system
didn't stall. I could provide dts files for both systems.

I've tried to reduce my dts as much as I could to match the am335x-evmsk
dts. I have even removed the entries for the PMIC, but the system still
stalls. Btw, the PMIC's INT line is connected to a GPIO pin on the processor.

I've used omap2plus_defconfig for all 3 devices. Any other info I can supply?

Yegor