Re: [PATCH] tpm_tis: Disable interrupts on ThinkPad T490s
From: Jarkko Sakkinen <jarkko@kernel.org>
Date: 2020-11-24 03:30:21
Also in: lkml
On Tue, Nov 24, 2020 at 05:27:30AM +0200, Jarkko Sakkinen wrote:
> On Thu, Nov 19, 2020 at 03:42:35PM +0100, Hans de Goede wrote:
> > Hi,
> >
> > On 11/19/20 7:36 AM, Jerry Snitselaar wrote:
> > > Matthew Garrett @ 2020-10-15 15:39 MST:
> > > > On Thu, Oct 15, 2020 at 2:44 PM Jerry Snitselaar [off-list ref] wrote:
> > > > > There is a misconfiguration in the bios of the gpio pin used for the interrupt in the T490s. When interrupts are enabled in the tpm_tis driver code this results in an interrupt storm. This was initially reported when we attempted to enable the interrupt code in the tpm_tis driver, which previously wasn't setting a flag to enable it. Due to the reports of the interrupt storm that code was reverted and we went back to polling instead of using interrupts.
> > > > >
> > > > > Now that we know the T490s problem is a firmware issue, add code to check if the system is a T490s and disable interrupts if that is the case. This will allow us to enable interrupts for everyone else. If the user has a fixed bios they can force the enabling of interrupts with tpm_tis.interrupts=1 on the kernel command line.
> > > >
> > > > I think an implication of this is that systems haven't been well-tested with interrupts enabled. In general when we've found a firmware issue in one place it ends up happening elsewhere as well, so it wouldn't surprise me if there are other machines that will also be unhappy with interrupts enabled.
> > > >
> > > > Would it be possible to automatically detect this case (eg, if we get more than a certain number of interrupts in a certain timeframe immediately after enabling the interrupt) and automatically fall back to polling in that case? It would also mean that users with fixed firmware wouldn't need to pass a parameter.
> > >
> > > I believe Matthew is correct here. I found another system today with completely different vendor for both the system and the tpm chip. In addition another Lenovo model, the L490, has the issue.
> > >
> > > This initial attempt at a solution like Matthew suggested works on the system I found today, but I imagine it is all sorts of wrong. In the 2 systems where I've seen it, there are about 100000 interrupts in around 1.5 seconds, and then the irq code shuts down the interrupt because they aren't being handled.
> >
> > Is that with your patch? The IRQ should be silenced as soon as devm_free_irq(chip->dev.parent, priv->irq, chip); is called.
> >
> > Depending on if we can get your storm-detection to work or not, we might also choose to just never try to use the IRQ (at least on x86 systems). AFAIK the TPM is never used for high-throughput stuff so the polling overhead should not be a big deal (and I'm getting the feeling that Windows always polls).
> >
> > Regards,
> >
> > Hans
>
> Yeah, this is what I've been wondering for a while. Why could not we just strip off IRQ code? Why does it matter?
And we DO NOT use interrupts in tpm_crb and nobody has ever complained.

/Jarkko
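
For readers following the thread, the storm detection Matthew suggests (count interrupts in a short window right after enabling the IRQ, and fall back to polling if there are too many) might look roughly like the sketch below. It is a standalone userspace illustration, not code from the tpm_tis driver or from Jerry's patch: the struct, the function names, and both thresholds are hypothetical, and a real driver would hook the check into its interrupt handler and release the IRQ once the detector trips.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical thresholds, chosen for illustration only. */
#define STORM_WINDOW_MS 100u   /* length of one counting window */
#define STORM_MAX_IRQS  1000u  /* more than this per window counts as a storm */

/* Hypothetical bookkeeping for one device's interrupt line. */
struct storm_detector {
    uint64_t window_start_ms; /* start time of the current window */
    uint32_t count;           /* interrupts seen in the current window */
    bool     polling;         /* set once we decide to fall back to polling */
};

static void storm_init(struct storm_detector *d, uint64_t now_ms)
{
    d->window_start_ms = now_ms;
    d->count = 0;
    d->polling = false;
}

/*
 * Call once per interrupt. Returns true when the caller should stop
 * using the IRQ and poll instead. The decision is sticky.
 */
static bool storm_note_irq(struct storm_detector *d, uint64_t now_ms)
{
    if (d->polling)
        return true;

    /* Start a fresh window once the current one has expired. */
    if (now_ms - d->window_start_ms >= STORM_WINDOW_MS) {
        d->window_start_ms = now_ms;
        d->count = 0;
    }

    if (++d->count > STORM_MAX_IRQS)
        d->polling = true;

    return d->polling;
}

/*
 * Feed the detector total_irqs interrupts spread evenly over duration_ms.
 * Returns the simulated time (ms) at which it tripped, or -1 if it never did.
 */
static int64_t simulate_storm(uint32_t total_irqs, uint64_t duration_ms)
{
    struct storm_detector d;

    assert(total_irqs > 0);
    storm_init(&d, 0);

    for (uint32_t i = 0; i < total_irqs; i++) {
        uint64_t now_ms = (uint64_t)i * duration_ms / total_irqs;

        if (storm_note_irq(&d, now_ms))
            return (int64_t)now_ms;
    }
    return -1;
}
```

Calling simulate_storm(100000, 1500) models the rate Jerry reports (about 100000 interrupts in around 1.5 seconds) and trips within the first window, while a light load such as simulate_storm(100, 1500) never does. Whether a check like this is worth carrying, compared with simply always polling as Hans and Jarkko suggest, is the open question in the thread.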