Thread (14 messages), 2 authors, 2012-08-02

Re: IRQ issues with multiple SiI3114's on Kernel 3.2

From: Stirling Westrup <hidden>
Date: 2012-07-28 23:41:44

On Sat, Jul 28, 2012 at 1:48 PM, Stirling Westrup [off-list ref] wrote:
On Sat, Jul 28, 2012 at 5:10 AM, Stan Hoeppner [off-list ref] wrote:
quoted
On 7/27/2012 9:20 PM, Stirling Westrup wrote:
quoted
On Fri, Jul 27, 2012 at 6:14 PM, Stirling Westrup [off-list ref] wrote:
quoted
On Fri, Jul 27, 2012 at 1:24 PM, Stan Hoeppner [off-list ref] wrote:
quoted
On 7/27/2012 11:40 AM, Stirling Westrup wrote:
quoted
I recently purchased a large system for use as a backup server for a
pair of small businesses. It contains a boot drive plus 10 more
storage drives. Despite having three onboard SATA controllers, the
motherboard didn't have enough SATA connectors for all the drives, so
I installed a pair of identical SiI3114 raid cards to handle the extra
connections. It has a Sandy Bridge chipset, so I installed a 3.2
kernel.

# uname -a
Linux ttt 3.2.0-0.bpo.2-amd64 #1 SMP Fri Jun 29 20:42:29 UTC 2012
x86_64 GNU/Linux
...
Okay, enough background. Here's the issue: I had no trouble building
and sync'ing the first array, but when I try to sync the second array,
I always get the following dmesg an hour or so into the process:

irq 19: nobody cared (try booting with the "irqpoll" option)
[  346.120572] Pid: 1100, comm: md1_resync Not tainted 3.2.0-0.bpo.2-amd64 #1
[  346.120573] Call Trace:
...
[  346.120697] handlers:
[  346.120699] [<ffffffffa00479e0>] ahci_interrupt
[  346.120702] [<ffffffffa02f17ec>] sil_interrupt
[  346.120703] Disabling IRQ #19
[  346.122145] sched: RT throttling activated
...
From this point onward syncing drops to a tiny fraction of its
previous speed. I've tried booting with 'irqpoll' as the error message
suggests, but it has had no effect. I'm really not sure if there is a
conflict between my two SiI3114's or between the SiI's and the Marvell
controller (although I've never had an issue with Marvell in the
past), nor how to go about diagnosing or fixing this.  I'll include a
full dmesg dump below, as well as my currently loaded modules. If
anyone wants any further info, just ask.
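
For anyone triaging a similar trace, here is a minimal Python sketch that pulls the IRQ number and the registered handlers out of a "nobody cared" report. The excerpt is copied from the dmesg lines quoted above; the function name parse_nobody_cared is made up for illustration, and the regular expressions assume the timestamp and handler layout shown in this thread, which other kernel versions may not follow.

```python
import re

# Excerpt copied from the dmesg output quoted in this thread.
DMESG_EXCERPT = """\
irq 19: nobody cared (try booting with the "irqpoll" option)
[  346.120697] handlers:
[  346.120699] [<ffffffffa00479e0>] ahci_interrupt
[  346.120702] [<ffffffffa02f17ec>] sil_interrupt
[  346.120703] Disabling IRQ #19
"""


def parse_nobody_cared(text):
    """Return (irq_number, [handler names]) from a 'nobody cared' report.

    Hypothetical helper: assumes the line layout seen in this thread.
    """
    irq = None
    handlers = []
    in_handlers = False
    for line in text.splitlines():
        # Drop a leading "[  123.456789] " timestamp if one is present.
        body = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line).strip()
        m = re.match(r"irq (\d+): nobody cared", body)
        if m:
            irq = int(m.group(1))
            in_handlers = False
            continue
        if body == "handlers:":
            in_handlers = True
            continue
        if in_handlers:
            m = re.match(r"\[<[0-9a-f]+>\]\s+(\S+)", body)
            if m:
                handlers.append(m.group(1))
            else:
                # First non-handler line ends the handler list.
                in_handlers = False
    return irq, handlers


irq, handlers = parse_nobody_cared(DMESG_EXCERPT)
print(irq, handlers)
```

Two handlers on one line means the IRQ is shared: in this report an AHCI controller and a SiI card (sil_interrupt) are both registered on IRQ 19, so disabling the line affects both.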
Have you tried irqbalance to spread the interrupts across cores/cache
domains? https://irqbalance.org/documentation.html
Thanks for the tip! I installed irqbalance and rebooted the system,
and everything has been running smoothly for the last two hours. I'll
let everyone know tomorrow if it actually finished the full 20-hour
resync without incident.
quoted
Alas, all it did was delay the IRQ error by a few hours. Does anyone
else have any ideas about how I could tackle this?
Try irqpoll and irqbalance together.  Also, which motherboard is this,
exact make/model please.  May be a BIOS issue.  Doesn't seem to be using
MSIs.  If the mobo and cards all support MSIs, enabling that may fix
this as well.

Also what make/model are the SiI3114 cards?  PCIe or PCI?
The motherboard is an Asus P8Z68-V Pro/Gen3 and has full MSI support,
but the cards are labeled "Syba SiI 3114 PCI to 4 Port Sata 150" and
don't support MSI.
quoted
Have you
tried different slot combinations?  Moving one card to a different slot
may get it routed to PCI INTB instead of INTA.  That may get it mapped
to something other than IRQ#19.  Updating the 3114 boards to their
latest firmware is worth a shot, if not there already.
At one point the system was mapping the interrupt to IRQ#17, instead
of 19, but it still failed.

I haven't tried moving the cards to different slots or anything but,
IIRC, it only has two PCI slots.
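
To check whether a slot change actually moves a card off a shared line, one option is to list the IRQs that more than one driver is attached to. The following is a rough Python sketch that parses /proc/interrupts-style text. The SAMPLE text is invented for illustration and is not output from this machine, the function name shared_irqs is hypothetical, and the parsing assumes device names are comma-separated and contain no spaces.

```python
# Invented sample in the /proc/interrupts layout; NOT output from this system.
SAMPLE = """\
           CPU0       CPU1
 16:        120          0   IO-APIC-fasteoi   ehci_hcd:usb1
 19:     481516      23420   IO-APIC-fasteoi   ahci, sata_sil
 40:      90210          0   PCI-MSI-edge      eth0
"""


def shared_irqs(text):
    """Map IRQ number -> list of device names for IRQs with 2+ devices."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row has one column per CPU
    result = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip NMI, LOC and other non-numeric rows
        fields = rest.split()
        # Everything after the per-CPU counters: chip/type, then devices.
        tail = " ".join(fields[ncpu:])
        if "," not in tail:
            continue  # a single device, so the line is not shared
        parts = [p.strip() for p in tail.split(",")]
        if not parts[0]:
            continue
        # The first device name is the last token before the first comma.
        first = parts[0].split()[-1]
        result[int(label)] = [first] + parts[1:]
    return result


print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed the real file, for example shared_irqs(open("/proc/interrupts").read()), once before and once after moving a card.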

I also have yet to try upgrading the BIOS of the mobo or updating the
card firmware. I was hoping to have a better idea of what was going
wrong before going down that route.

I also wonder if the problem could be kernel or libata related
(which is why I'm asking in this forum).
Okay, it looks like it's a known hardware chipset problem, first
reported six months ago.

It affects all PCI cards in Asus Sandy-Bridge Motherboards. No known
fix as of yet.

https://lkml.org/lkml/2012/1/30/216