Re: [ath9k-devel] [RFC] ath9k: Fix softlockup in AR9485
From: Mohammed Shafi Shajakhan <hidden>
Date: 2012-05-31 05:34:38
Hi Sujith, On Thursday 31 May 2012 08:14 AM, Sujith Manoharan wrote:
Mohammed Shafi Shajakhan wrote:quoted
before the STA is associated PLL4(0x1618c) always seem to dump zero, which causes a softlockup as we keep on polling infinitely PLL4's 8th bit. Once PLL4'S 8th bit is set indicates we can take PLL3(0x16188) readings which tells us whether PLL is locked or not. if the PLL is not locked we have a rx hang. It does not looks safe to poll PLL4 before the STA is associated. so it is safe to check this rx hang after association, where we might have a possibility of rx hang under stress testing. digging into the PLL stuff did not yield anything much better. previously the register dumped 0xdeadbeef and the hw_pll_work itself is cancelled by ath_radio_disable(obselete) fixes https://bugzilla.redhat.com/show_bug.cgi?id=811142 Reported-by: Rolf Offermanns<redacted> Cc: stable@vger.kernel.org Signed-off-by: Mohammed Shafi Shajakhan<redacted> --- drivers/net/wireless/ath/ath9k/main.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-)diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index dfa78e8..a03ad3e 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c@@ -970,6 +970,9 @@ void ath_hw_pll_work(struct work_struct *work) hw_pll_work.work); u32 pll_sqsum; + if (!(sc->sc_flags& SC_OP_PRIM_STA_VIF)) + return; + if (AR_SREV_9485(sc->sc_ah)) { ath9k_ps_wakeup(sc);All this is really racy, I think. In the various callsites where we issue a HW reset by queuing a work, there is no guarantee that the caller would not be invoked once again before the reset work has had a chance to kick in. And there is nothing that prevents the various poll routines from checking the HW while a reset in progress.
should we check if (work_pending(&sc->hw_reset_work)) like we do in ath_beacon_tasklet in all the other work routine that access hardware registers in case the hw_reset_work is not cancelled(possible in ath_set_channel)
And why are we using HZ for the pll_work ?
don't know, i put a printk to see the routine seems to get executed for every 200ms(HZ/5). scan work in mac80211 seems to use HZ ieee80211_queue_delayed_work(&local->hw, &local->scan_work, next_delay);
Sujith
-- thanks, shafi