Thread (8 messages) 8 messages, 4 authors, 2024-11-03

Re: [net-next v2] net: wwan: t7xx: reset device if suspend fails

From: Jinjian Song <hidden>
Date: 2024-10-31 13:09:50
Also in: linux-arm-kernel, linux-doc, linux-mediatek, lkml

From: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Hi Jinjian,

On 29.10.2024 05:46, Jinjian Song wrote:
quoted
From: Sergey Ryazanov <ryazanov.s.a@gmail.com>
quoted
On 22.10.2024 11:43, Jinjian Song wrote:
quoted
If driver fails to set the device to suspend, it means that the
device is abnormal. In this case, reset the device to recover
when PCIe device is offline.
Is it a reproducible or a speculative issue? Does the fix recover 
modem from a problematic state?

Anyway we need someone more familiar with this hardware (Intel or 
MediaTek engineer) to Ack the change to make sure we are not going to 
put a system in a more complicated state.
Hi Sergey,

This is a very difficult issue to replicate onece occured and fixed.

The issue occured when driver and device lost the connection. I have
encountered this problem twice so far:
1. During suspend/resume stress test, there was a probabilistic D3L2
time sequence issue with the BIOS, result in PCIe link down, driver
read and write the register of device invalid, so suspend failed.
This issue was eventually fixed in the BIOS and I was able to restore
it through the reset module after reproducing the problem.

2. During idle test, the modem probabilistic hang up, result in PCIe
link down, driver read and write the register of device invalid, so
suspend failed. This issue was eventually fiex in device modem firmware
by adjust a certain power supply voltage, and reset modem as a workround
to restore when the MBIM port command timeout in userspace applycations.

Hardware reset modem to recover was discussed with MTK, and they said
that if we don't want to keep the on-site problem location in case of
suspend failure, we can use the recover solution.
Both the ocurred issues result in the PCIe link issue, driver can't read 
and writer the register of WWAN device, so I want to add this path
to restore, hardware reset modem can recover modem, but using the 
pci_channle_offline() as the judgment is my inference.
Thank you for the clarification. Let me summarize what I've understood 
from the explanation:
a) there were hardware (firmware) issues,
b) issues already were solved,
c) issues were not directly related to the device suspension procedure,
d) you want to implement a backup plan to make the modem support robust.

If got it right, then I would like to recommend to implement a generic 
error handling solution for the PCIe interface. You can check this 
document: Documentation/PCI/pci-error-recovery.rst
Hi Sergey,

Yes, got it right.
I want to identify the scenario and then recover by reset device,
otherwise suspend failure will aways prevent the system from suspending
if it occurs.
Suddenly, I am not an expert in the PCIe link recovery procedure, so 
I've CCed this message to PCI subsystem maintainers. I hope they can 
suggest a conceptually correct way to handle these cases.
Thanks.
quoted
quoted
quoted
Signed-off-by: Jinjian Song <redacted>
---
V2:
  * Add judgment, reset when device is offline
---
  drivers/net/wwan/t7xx/t7xx_pci.c | 4 ++++
  1 file changed, 4 insertions(+)
diff --git a/drivers/net/wwan/t7xx/t7xx_pci.c b/drivers/net/wwan/ 
t7xx/t7xx_pci.c
index e556e5bd49ab..4f89a353588b 100644
--- a/drivers/net/wwan/t7xx/t7xx_pci.c
+++ b/drivers/net/wwan/t7xx/t7xx_pci.c
@@ -427,6 +427,10 @@ static int __t7xx_pci_pm_suspend(struct pci_dev 
*pdev)
      iowrite32(T7XX_L1_BIT(0), IREG_BASE(t7xx_dev) + 
ENABLE_ASPM_LOWPWR);
      atomic_set(&t7xx_dev->md_pm_state, MTK_PM_RESUMED);
      t7xx_pcie_mac_set_int(t7xx_dev, SAP_RGU_INT);
+    if (pci_channel_offline(pdev)) {
+        dev_err(&pdev->dev, "Device offline, reset to recover\n");
+        t7xx_reset_device(t7xx_dev, PLDR);
+    }
      return ret;
  }
--
Sergey
Best Regards,
Jinjian
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help