Re: am335x: cpsw: interrupt failure
From: Felipe Balbi <hidden>
Date: 2014-12-29 17:13:55
Also in:
linux-omap
On Mon, Dec 29, 2014 at 08:51:04AM -0800, Tony Lindgren wrote:
> * Felipe Balbi [off-list ref] [141229 07:53]:
> > On Mon, Dec 29, 2014 at 10:33:26AM +0100, Yegor Yefremov wrote:
> > > On Fri, Dec 12, 2014 at 8:19 PM, Yegor Yefremov [off-list ref] wrote:
> > > > On Fri, Dec 12, 2014 at 6:32 PM, Felipe Balbi [off-list ref] wrote:
> > > > > Hi,
> > > > >
> > > > > On Fri, Dec 12, 2014 at 01:00:51PM +0100, Yegor Yefremov wrote:
> > > > > > U-Boot version: 2014.07
> > > > > > Kernel config is omap2plus with enabled USB
> > > > > >
> > > > > > # cat /proc/version
> > > > > > Linux version 3.18.0 (user@user-VirtualBox) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-29) ) #6 SMP Mon Dec 8 22:47:43 CET 2014
> > > > >
> > > > > Wasn't GCC 4.8.x total crap for building ARM kernels ? IIRC it
> > > > > was even blacklisted. Can you try with 4.9.x just to make sure ?
> > > >
> > > > Will do.
> > >
> > > Adding linux-omap.
> > >
> > > Beginning of this discussion:
> > > http://comments.gmane.org/gmane.linux.network/341427
> > >
> > > Quick summary: starting with kernel 3.18 or commit
> > > 55601c9f24670ba926ebdd4d712ac3b177232330 am335x (at least BBB and
> > > some custom boards) stalls at high network load. Reproducible via
> > > nuttcp within some minutes
> > >
> > > nuttcp -S (on BBB)
> > > nuttcp -t -N 4 -T30m 192.168.1.235 (on host)
> > >
> > > As Felipe Balbi suggested, I tried both 4.8.3 and 4.9.2 toolchains,
> > > but both show the same behavior.
> > >
> > > Linux version 3.18.0 (user@user-VirtualBox) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-29) ) #6 SMP Mon Dec 8 22:47:43 CET 2014
> > >
> > > Linux version 3.18.1 (user@user-VirtualBox) (gcc version 4.9.2 (Buildroot 2015.02-git-00582-g10b9761) ) #1 SMP Mon Dec 29 09:22:29 CET 2014
> > >
> > > Let me know, if you can reproduce this issue.
> >
> > finally managed to reproduce this, it took quite a bit of effort
> > though. I'll see if I can gather more information about the problem.
>
> Maybe check if the irqnr is 127 (or the last reserved interrupt) in
> irq-omap-intc.c. If so, also print out the previous interrupt.
>
> It seems the intc uses the last reserved interrupt to signal a
> spurious interrupt for the previous irqnr, so we should probably add
> some handling for that.
>
> If the previous interrupt is a cpsw interrupt, then there's probably
> something wrong with cpsw interrupt handling. Either a missing
> read-back to flush posted write in the cpsw interrupt handler, or the
> EOI registers are written at a wrong time.

yeah, I'll go over it, but I first need to reproduce it again. Just
rebooted to try again and after half an hour, couldn't reproduce it
anymore.

Interesting race to end the year :-)

cheers

-- 
balbi
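
Tony's suggestion above (check for irqnr 127 and print the previous interrupt) could be prototyped roughly as follows. This is only a user-space sketch of the bookkeeping, not a patch against irq-omap-intc.c: `SPURIOUS_IRQNR`, `struct irq_tracker`, `irq_tracker_init()` and `irq_tracker_observe()` are made-up names for illustration, and the value 127 is taken from the mail above rather than checked against the hardware documentation.

```c
#include <stdio.h>

/* Assumption from the thread: irqnr 127 (the last reserved line) marks a
 * spurious interrupt. Hypothetical name, not a kernel macro. */
#define SPURIOUS_IRQNR 127

/* Hypothetical bookkeeping; not a structure from irq-omap-intc.c. */
struct irq_tracker {
    int prev_irqnr;          /* last non-spurious irqnr seen, -1 if none yet */
    unsigned spurious_count; /* how many spurious events were observed */
};

static void irq_tracker_init(struct irq_tracker *t)
{
    t->prev_irqnr = -1;
    t->spurious_count = 0;
}

/*
 * Feed one irqnr into the tracker.
 * Returns 1 if irqnr is the spurious marker (and logs the previous irqnr,
 * which is the one to look at), 0 otherwise.
 */
static int irq_tracker_observe(struct irq_tracker *t, int irqnr)
{
    if (irqnr == SPURIOUS_IRQNR) {
        t->spurious_count++;
        fprintf(stderr, "spurious irq %d, previous irqnr was %d\n",
                irqnr, t->prev_irqnr);
        return 1; /* prev_irqnr is left untouched on purpose */
    }

    t->prev_irqnr = irqnr;
    return 0;
}
```

In a real handler the same idea would sit right after the irqnr is read, with the kernel's own logging in place of `fprintf`. If the logged previous irqnr turns out to be one of the cpsw lines, that would point at the cpsw handler, as described above.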