Thread (13 messages), 4 authors, 2020-02-10

Re: RE: Re: "oneshot" interrupt causes another interrupt to be fired erroneously in Haswell system

From: Sean V Kelley <hidden>
Date: 2020-01-23 01:37:35
Also in: linux-pci, lkml

On Thu, 2020-01-16 at 11:01 +0100, Thomas Gleixner wrote:
Kar Hin Ong [off-list ref] writes:
I don't have access to the document you mentioned, but I know that
chipsets have a knob to control that behaviour. Just checked a few
chipset docs and they contain the same sentence, but then in the next
paragraph they say:

 "If the I/OxAPIC entry is masked (via the mask bit in the corresponding
  Redirection Table Entry), then the corresponding PCI Express
  interrupt(s) is forwarded to the legacy ICH, provided the Disable PCI
  INTx Routing to ICH bit is clear, Section 19.10.2.27, QPIPINTRC: Intel
  QuickPath Interconnect Protocol Interrupt Control."

That control bit is 0 after reset, so the legacy forwarding works.

An Intel support engineer gave us similar advice as a workaround for
the CPU behaviour.  They said we could enable the "Don'tRouteToPCH" bit
in the BIOS to block the interrupt from propagating to the PCH.  This
bit is located in the "Coherent Interface Protocol Interrupt Control
(cipintrc)" register of the "Virtualization" device (Bus 0, Device 5,
Function 0, Offset 0x14C).
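
For anyone who wants to inspect that register from a running Linux
system rather than from the BIOS, here is a minimal sketch. It assumes
PCI domain 0000 and the standard sysfs "config" file for the device;
the helper names are hypothetical. The position of the
"Don'tRouteToPCH" bit inside the register is not given in this thread,
so the sketch only returns the raw 32-bit value. Reading offsets at or
above 0x100 normally requires root and PCIe extended config space.

```python
import struct

# Offset of the cipintrc register as quoted in this thread.
CIPINTRC_OFFSET = 0x14C


def sysfs_config_path(domain=0, bus=0, device=5, function=0):
    """Build the sysfs config-space path for a PCI function.

    Defaults match the location quoted in the thread (Bus 0, Device 5,
    Function 0); domain 0 is an assumption.
    """
    return (
        f"/sys/bus/pci/devices/"
        f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}/config"
    )


def read_config_dword(path, offset=CIPINTRC_OFFSET):
    """Read one little-endian 32-bit value from a config-space file."""
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(4)
    if len(raw) != 4:
        # Without sufficient privileges the kernel exposes only part
        # of config space, so a short read means the offset is out of reach.
        raise OSError(f"could not read 4 bytes at offset {offset:#x}")
    return struct.unpack("<I", raw)[0]
```

Calling read_config_dword(sysfs_config_path()) on an affected machine
would return the current register contents; which bit to test has to
come from the datasheet for the specific CPU.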

With the help of our BIOS engineer, we confirmed that setting this bit
in the BIOS does prevent the interrupt forwarding.

However, Intel told us that this workaround is not validated, i.e. the
side effect of setting this bit is unknown.

What? That's ridiculous.

That bit is documented in various chipset documents and that legacy
rerouting is really just there to support OSes which do not support
multiple IO-APICs properly.

If setting this bit has unknown side effects then someone at Intel
should have a close look and fix their documentation.

Can the Intel people on Cc please take care of this?

I looked into it Thomas.  The issue is as you suggested early in the
thread.  If an IRQ arrives at line N of a non-primary IO-APIC and that
line is masked, a new IRQ is generated on the primary IO-APIC/PIC.  

The BIOS setting that addresses this forwarding is, as above, "Disable
INTx Route to PCH/ICH/SouthBridge". When this bit is set, local INTx
messages received from the PCI-E ports are not routed to the legacy
PCH. They are either converted into MSI via the integrated I/OxAPIC (if
the I/OxAPIC mask bit is clear in the appropriate entries) or cause no
further action (when the mask bit is set).
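
The routing rules described so far can be summarized in a small model.
This is only an illustrative sketch of the behaviour stated in this
thread, not chipset code; the names Outcome and route_intx are
hypothetical. The unmasked case with the bit clear is assumed to be
normal delivery through the integrated I/OxAPIC.

```python
from enum import Enum


class Outcome(Enum):
    MSI_VIA_IOXAPIC = "converted into MSI via the integrated I/OxAPIC"
    FORWARDED_TO_PCH = "forwarded to the legacy PCH/ICH"
    NO_ACTION = "no further action"


def route_intx(rte_masked, disable_route_to_pch):
    """Model what happens to an INTx message from a PCI-E port.

    rte_masked: mask bit of the matching I/OxAPIC redirection entry.
    disable_route_to_pch: state of the "Disable INTx Route to PCH" bit
    (0 after reset, according to the quoted datasheet text).
    """
    if not rte_masked:
        # Unmasked entry: the integrated I/OxAPIC delivers the interrupt.
        return Outcome.MSI_VIA_IOXAPIC
    if disable_route_to_pch:
        # Masked and forwarding disabled: the message is dropped.
        return Outcome.NO_ACTION
    # Masked and forwarding enabled (reset default): this is the path
    # that raises the extra interrupt on the primary IO-APIC/PIC.
    return Outcome.FORWARDED_TO_PCH
```

With the reset default (bit clear), masking the line while a oneshot
threaded handler runs is exactly the case that yields FORWARDED_TO_PCH.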

This capability is tested and supported fully on Intel platforms.

For example, 5520 [1], Xeon E5 4600 [2], Xeon E7 [3], and so on include
this bit:

[1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/5520-5500-chipset-ioh-datasheet.pdf
    page 139
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-1600-2600-vol-2-datasheet.pdf
    page 280
[3] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e7-v2-datasheet-vol-2.pdf
    page 373

Once you get to SKX/CLX things change, and the integrated IOxAPICs in
the IIO module convert legacy PCI Express interrupt messages into MSI
interrupts directly.  Beyond SKX/CLX there are no longer IOxAPICs in
the IIO; the IOxAPIC is only in the PCH. Devices connected to the IIO
will use native MSI/MSI-X mechanisms.

The problem is with the absolute lack of useful documentation.  That’s
not acceptable.  

You recall the work Olaf and Stefan did at SuSE ten years ago (?) on
boot irq quirks and the amount of research they had to do to learn
about the behavior [4].

[4]
http://lkml.iu.edu/hypermail/linux/kernel/0807.1/3160.html
 
From a Real-Time Linux perspective this is really important to me.  As
we get closer to being fully mainlined, we need to have this
information readily available, given the greater usage of threaded irqs
in combination with legacy interrupts on the older platforms.

So I will ensure we actually create useful information pointing to this
behavior either in kernel docs or online as in a white paper or both.

As we already have quirks in drivers/pci/quirks.c which handle the same
issue on older chipsets, we really should add one for this kind of
system to avoid fiddling with the BIOS (which you can, but most people
cannot).

Agreed, and I will follow up with Kar Hin Ong to get them added.

Thanks,

Sean


Thanks,

        tglx
  