Thread: 5 messages, 3 authors, 2021-12-01

Re: [regression] mhi: ath11k resume fails on some devices

From: Thorsten Leemhuis <hidden>
Date: 2021-12-01 07:34:16
Also in: ath11k, linux-arm-msm, regressions

Hi, this is your Linux kernel regression tracker speaking, this time
looking for a status update.

On 18.11.21 18:41, Manivannan Sadhasivam wrote:
On Thu, Oct 21, 2021 at 03:33:05PM +0530, Manivannan Sadhasivam wrote:
On Tue, Oct 19, 2021 at 03:12:01PM +0300, Kalle Valo wrote:
Kalle Valo [off-list ref] writes:
(adding the new mhi list, yay)

Hi Loic,

Loic Poulain [off-list ref] writes:
Loic Poulain [off-list ref] writes:
On Thu, 16 Sept 2021 at 10:00, Kalle Valo [off-list ref] wrote:
At the moment I'm running my tests with commit 020d3b26c07a reverted and
everything works without problems. Is there a simple way to fix this? Or
maybe we should just revert the commit? Commit log and kernel logs from
a failing case below.
Do you have log of success case?
A log from a successful case is at the end of this email, using v5.15-rc1
plus a revert of commit 020d3b26c07abe27.
To me, the device loses power, that is why MHI resuming is failing.
Normally the device should be properly recovered/reinitialized. Before
that patch the power loss was simply not detected (or handled at
higher stack level).
Currently in ath11k we always keep the firmware running when in suspend,
this is a workaround due to problems between mac80211 and MHI stack.
IIRC the problem was something related to MHI creating struct device during
resume or something like that.
Could you give a try with the attached patch? It should solve your
issue without breaking modem support.
Sorry for taking so long, but I now tested your patch on top of
v5.15-rc3 and, as expected, everything works as before with QCA6390 on
NUC x86 testbox.

Tested-by: Kalle Valo <redacted>
I doubt we will find enough time to fully debug this mhi issue anytime
soon. Can we commit Loic's patch so that this regression is resolved?
Sorry no :( Even though Loic's patch is working, I want to understand the
issue properly so that we could add a proper fix or patch the firmware
if possible.

Let's try to get the debug logs as I requested.
I'm able to reproduce the issue on my NUC. I'm still investigating how to
properly fix this issue. Expect a patch soon.
Was there some progress? This issue was reported 75 days ago and still
is not fixed. From the point of the Linux kernel regression tracker I'd
say: it should not take this long. Looking back at it I wonder if
'revert the culprit and reapply it later together with a proper fix'
would have been the better strategy. I wonder if that would still be the
best way forward if no patch is forthcoming soon.

Ciao, Thorsten

#regzbot poke
At the moment I'm doing all my regression testing with commit
020d3b26c07abe27 reverted. That's a risk, I would prefer to do my
testing without any hacks.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might be misleading to everyone reading this; any
suggestion I gave might thus send someone reading this down the
wrong rabbit hole, which none of us wants.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CCed on
all further activities with regard to this regression.