Re: [PATCH] ath10k: move irq setup
From: Kalle Valo <hidden>
Date: 2013-07-30 18:35:47
Michal Kazior [off-list ref] writes:
There was a slight race during PCI shutdown. Since interrupts weren't really stopped (only Copy Engine interrupts were disabled through device hw registers) it was possible for a firmware indication (crash) interrupt to come in after tasklets were synced/killed. This would cause memory corruption and a panic in most cases. It was also possible for interrupt to come before CE was initialized during device probing. Interrupts are required for BMI phase so they are enabled as soon as power_up() is called but are freed upon both power_down() and stop() so there's asymmetry here. As by design stop() cannot be followed by start() it is okay. Both power_down() and stop() should be merged later on to avoid confusion.
Why are the interrupts freed both in power_down() and stop()? I don't get that. What if we call disable_irq() in power_down() instead?
Before this can be really properly fixed var/hw init code split is necessary. Signed-off-by: Michal Kazior <redacted> --- Please note: this is based on my (still under review at the time of posting) previous patchests: device setup refactor and recovery. I'm posting this before those patchsets are merged so anyone interested in testing this fix (I can't reproduce the problem on my setup) can give it a try.
This was reported by Ben, right? So this sould have a Reported-by line attributing him.
quoted hunk ↗ jump to hunk
@@ -1783,16 +1792,24 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar) return 0; err_ce: + /* XXX: Until var/hw init is split it's impossible to fix the ordering + * here so we must call stop_intr() here too to prevent interrupts after + * CE is teared down. It's okay to double call the stop_intr()*/
"FIXME:"
exit: + ar_pci->intr_started = ret == 0;
A bit too clever for the sake of readibility for my taste, but I guess it's ok.
quoted hunk ↗ jump to hunk
--- a/drivers/net/wireless/ath/ath10k/pci.h +++ b/drivers/net/wireless/ath/ath10k/pci.h@@ -198,6 +198,7 @@ struct ath10k_pci { * interrupts. */ int num_msi_intrs; + bool intr_started;
Adding a new state variable makes me worried. I really would prefer a solution which would not require that. Also if we call request_irq() in ath10k_pci_probe() we should also call free_irq() in ath10k_pci_remove() for symmetry. Just doing a temporary hack will most likely stay forever :) -- Kalle Valo