Thread: 21 messages, 4 authors, 2020-06-11

Re: [PATCH][v2] iommu: arm-smmu-v3: Copy SMMU table for kdump kernel

From: Bjorn Helgaas <helgaas@kernel.org>
Date: 2020-06-11 23:03:31
Also in: kexec, linux-pci

On Sun, Jun 07, 2020 at 02:00:35PM +0530, Prabhakar Kushwaha wrote:
> On Thu, Jun 4, 2020 at 5:32 AM Bjorn Helgaas [off-list ref] wrote:
> > On Wed, Jun 03, 2020 at 11:12:48PM +0530, Prabhakar Kushwaha wrote:
> > > On Sat, May 30, 2020 at 1:03 AM Bjorn Helgaas [off-list ref] wrote:
> > > > On Fri, May 29, 2020 at 07:48:10PM +0530, Prabhakar Kushwaha wrote:

<snip>

> > > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > > index 117c0a2b2ba4..26b908f55aef 100644
> > > > --- a/drivers/pci/pcie/err.c
> > > > +++ b/drivers/pci/pcie/err.c
> > > > @@ -66,6 +66,20 @@ static int report_error_detected(struct pci_dev *dev,
> > > >                 if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
> > > >                         vote = PCI_ERS_RESULT_NO_AER_DRIVER;
> > > >                         pci_info(dev, "can't recover (no error_detected callback)\n");
> > > > +
> > > > +                       pci_save_state(dev);
> > > > +                       pci_cfg_access_lock(dev);
> > > > +
> > > > +                       /* Quiesce the device completely */
> > > > +                       pci_write_config_word(dev, PCI_COMMAND,
> > > > +                             PCI_COMMAND_INTX_DISABLE);
> > > > +                       if (!__pci_reset_function_locked(dev)) {
> > > > +                               vote = PCI_ERS_RESULT_RECOVERED;
> > > > +                               pci_info(dev, "recovered via pci level reset\n");
> > > > +                       }
> >
> > So I guess we *do* need to save the state before the reset and restore
> > it (either that or enumerate the device from scratch just like we
> > would if it had been hot-added).  I'm not really thrilled with trying
> > to save the state after the device has already reported an error.  I'd
> > rather do it earlier, maybe during enumeration, like in
> > pci_init_capabilities().  But I don't understand all the subtleties of
> > dev->state_saved, so that requires some legwork.
>
> I tried moving pci_save_state earlier. All observations are the same
> as mentioned in earlier discussions.

By "legwork", I didn't mean just trying things to see whether they
seem to work.  I meant researching the history to find out *why* it's
designed the way it is so that when we change it, we don't break
things.

For example, these commits are obviously important to understand:

  aa8c6c93747f ("PCI PM: Restore standard config registers of all devices early")
  c82f63e411f1 ("PCI: check saved state before restore")
  4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored")
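
To illustrate the kind of subtlety those commit titles hint at, here is a
minimal user-space sketch; it is not kernel code.  It assumes, based only
on the titles above, that a restore is skipped when nothing has been
saved and that the saved flag is cleared once a restore has run.
struct toy_dev and the toy_*() helpers are hypothetical stand-ins, not
kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct toy_dev {
	uint16_t command;        /* live "config space" value */
	uint16_t saved_command;  /* snapshot taken by toy_save_state() */
	bool state_saved;        /* plays the role of dev->state_saved */
};

/* Snapshot the live value and mark the snapshot valid. */
static void toy_save_state(struct toy_dev *dev)
{
	dev->saved_command = dev->command;
	dev->state_saved = true;
}

/*
 * Assumed semantics: restore only if a snapshot exists, then invalidate
 * it.  Returns true if a restore actually happened.
 */
static bool toy_restore_state(struct toy_dev *dev)
{
	if (!dev->state_saved)
		return false;
	dev->command = dev->saved_command;
	dev->state_saved = false;
	return true;
}

/* A reset wipes the live value back to a power-on default (0 here). */
static void toy_reset(struct toy_dev *dev)
{
	dev->command = 0;
}

/*
 * Save once, as if at enumeration, then run two reset/restore cycles.
 * Returns how many of the two restores took effect.
 */
static int toy_two_reset_cycles(struct toy_dev *dev)
{
	int restored = 0;

	assert(dev);
	toy_save_state(dev);
	for (int i = 0; i < 2; i++) {
		toy_reset(dev);
		if (toy_restore_state(dev))
			restored++;
	}
	return restored;
}
```

Under those assumptions, a single save at enumeration time covers only
the first reset: the second reset finds no valid snapshot and the device
stays at its reset defaults.  Whether the real code behaves this way in
every path is exactly what the history research would need to confirm.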

I think we need to step back and separate this AER issue from the
whole SMMU table copying thing.  Then do the research and start a
new thread with a patch to fix just the AER issue.

The ARM guys would probably be grateful to be dropped from the AER
thread because it really has nothing to do with ARM.

Bjorn

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel