Thread (12 messages) 12 messages, 6 authors, 2015-06-23

Re: [PATCH 1/3] ath9k: handle RoC abort correctly

From: Krishna Chaitanya <hidden>
Date: 2015-06-22 14:01:57

On Mon, Jun 22, 2015 at 6:32 PM, Michal Kazior [off-list ref] wrote:
On 22 June 2015 at 13:56, Krishna Chaitanya [off-list ref] wrote:
quoted
On Mon, Jun 22, 2015 at 5:13 PM, Janusz Dziedzic
[off-list ref] wrote:
quoted
In case we will get ROC abort we should not call
ieee80211_remain_on_channel_expired().

In other case I hit such warning on MIPS and
p2p negotiation failed (tested with use_chanctx=1).

ath: phy0: Starting RoC period
ath: phy0: Channel definition created: 2412 MHz
ath: phy0: Assigned next_chan to 2412 MHz
ath: phy0: Offchannel duration for chan 2412 MHz : 506632
ath: phy0: ath_chanctx_set_next: current: 2412 MHz, next: 2412 MHz
ath: phy0: Stopping current chanctx: 2412
ath: phy0: Flush timeout: 200
ath: phy0: ath_chanctx_set_next: Set channel 2412 MHz
ath: phy0: Set channel: 2412 MHz width: 0
ath: phy0: Reset to 2412 MHz, HT40: 0 fastcc: 0
ath: phy0: cur_chan: 2412 MHz, event: ATH_CHANCTX_EVENT_TSF_TIMER, state: ATH_CHANCTX_STATE_IDLE
ath: phy0: ath_offchannel_channel_change: offchannel state: ATH_OFFCHANNEL_ROC_START
ath: phy0: cur_chan: 2412 MHz, event: ATH_CHANCTX_EVENT_SWITCH, state: ATH_CHANCTX_STATE_IDLE
ath: phy0: Cancel RoC
ath: phy0: RoC aborted
ath: phy0: RoC request on vif: 00:03:7f:4e:a0:cd, type: 1 duration: 500
ath: phy0: Starting RoC period
ath: phy0: Channel definition created: 2412 MHz
ath: phy0: Assigned next_chan to 2412 MHz
ath: phy0: Offchannel duration for chan 2412 MHz : 506705
ath: phy0: ath_chanctx_set_next: current: 2412 MHz, next: 2412 MHz
ath: phy0: ath_offchannel_channel_change: offchannel state: ATH_OFFCHANNEL_ROC_START
ath: phy0: cur_chan: 2412 MHz, event: ATH_CHANCTX_EVENT_SWITCH, state: ATH_CHANCTX_STATE_IDLE
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3312 at drivers/net/wireless/ath/ath9k/main.c:2319
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211

Signed-off-by: Janusz Dziedzic <redacted>
---
 drivers/net/wireless/ath/ath9k/channel.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath9k/channel.c b/drivers/net/wireless/ath/ath9k/channel.c
index 2066650..e211325 100644
--- a/drivers/net/wireless/ath/ath9k/channel.c
+++ b/drivers/net/wireless/ath/ath9k/channel.c
@@ -926,7 +926,8 @@ void ath_roc_complete(struct ath_softc *sc, bool abort)

        sc->offchannel.roc_vif = NULL;
        sc->offchannel.roc_chan = NULL;
-       ieee80211_remain_on_channel_expired(sc->hw);
+       if (!abort)
+               ieee80211_remain_on_channel_expired(sc->hw);
        ath_offchannel_next(sc);
        ath9k_ps_restore(sc);
 }
If HW aborts RoC in middle, should not we inform mac80211
that RoC is expired?
Good point. The ath_roc_complete() can be called with abort=true from
ath9k_cancel_pending_offchannel() as well. I guess ath_roc_complete()
needs a "reason" argument (instead of "abort") with: expired, aborted,
cancelled values. ieee80211_remain_on_channel_expired() should be
called whenever reason != cancelled.
Agree, make sense.
By the way - is ath_roc_complete() lock protected properly? It looks
like it isn't from a quick glance. Neither sdata lock nor local->mtx
can be implied in all contexts and sc->mutex isn't always held while
it's called, hmm.. or am I missing something?
quoted
Also the we are clearing roc_vif independent of abort, so the warning
indicates that roc_complete has not come from FW, may be we should
understand that first?
There's no FW in ath9k.

The problem is the following sequence:
 1. mac80211 requests roc A
 2. mac80211 cancels roc A
   a. ath9k calls expired() and hw_roc_done work is scheduled
 3. mac80211 requests roc B
 4. mac80211 starts to process the scheduled hw_roc_done
 5. mac80211 thinks roc B has expired
 6. mac80211 requests roc C
 7. ath9k WARN_ON is hit

There's a race between (3) and (4). Depending on circumstances (3) and
(4) may be reordered so the current code doesn't fail all the time.
Ok i understand, but if we get roc_complete for B before 6, then it works
fine at least at ath9k level, C will be unblocked.

Anyways, handling the cancel case should resolve it along with proper locking.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help