Thread: 30 messages, 3 authors, 2025-12-16

Re: [PATCH v6 3/5] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd

From: Lukas Wunner <lukas@wunner.de>
Date: 2025-10-20 14:24:54
Also in: linux-pci, lkml

On Mon, Oct 20, 2025 at 10:17:10PM +0800, Shuai Xue wrote:
> void aer_report_frozen_error(struct pci_dev *dev)
> {
>     struct aer_err_info info;
>
>     if (dev->pci_type != PCI_EXP_TYPE_ENDPOINT &&
>         dev->pci_type != PCI_EXP_TYPE_RC_END)
>         return;
>
>     aer_info_init(&info);
>     aer_add_error_device(&info, dev);
>     info.severity = AER_FATAL;
>     if (aer_get_device_error_info(&info, 0, true))
>         aer_print_error(&info, 0);
>
>     /* pci_dev_put() pairs with pci_dev_get() in aer_add_error_device() */
>     pci_dev_put(dev);
> }

Much better.  Again, I think you don't need to rename add_error_device()
and then the code comment even fits on the same line:

	pci_dev_put(dev);  /* pairs with pci_dev_get() in add_error_device() */
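
For illustration only, the get/put pairing that the comment documents can be modelled in userspace as below. This is a minimal sketch, not kernel code: every mock_* name, the struct layouts and MOCK_MAX_ERR_DEVS are hypothetical stand-ins for pci_dev_get(), pci_dev_put() and add_error_device().

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_ERR_DEVS 5	/* hypothetical capacity, chosen for the sketch */

/* Hypothetical stand-in for struct pci_dev: only the reference count. */
struct mock_pci_dev {
	int refcount;
};

/* Hypothetical stand-in for struct aer_err_info. */
struct mock_err_info {
	struct mock_pci_dev *dev[MOCK_MAX_ERR_DEVS];
	int error_dev_num;
};

static struct mock_pci_dev *mock_dev_get(struct mock_pci_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void mock_dev_put(struct mock_pci_dev *dev)
{
	if (dev) {
		assert(dev->refcount > 0);
		dev->refcount--;
	}
}

/* Takes a reference on success; takes none if the array is full. */
static int mock_add_error_device(struct mock_err_info *info,
				 struct mock_pci_dev *dev)
{
	if (info->error_dev_num >= MOCK_MAX_ERR_DEVS)
		return -1;
	info->dev[info->error_dev_num++] = mock_dev_get(dev);
	return 0;
}

void mock_report_frozen_error(struct mock_pci_dev *dev)
{
	struct mock_err_info info = { .error_dev_num = 0 };

	if (mock_add_error_device(&info, dev))
		return;		/* no reference was taken, nothing to drop */

	/* ... error reporting would happen here ... */

	mock_dev_put(dev);  /* pairs with mock_dev_get() in mock_add_error_device() */
}
```

In this model the caller's reference count is the same before and after mock_report_frozen_error(), which is the property the one-line comment is meant to make obvious.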

> > >    .slot_reset()
> > >      => pci_restore_state()
> > >        => pci_aer_clear_status()
> >
> > This was added in 2015 by b07461a8e45b.  The commit claims that
> > the errors are stale and can be ignored.  It turns out they cannot.
> >
> > So maybe pci_restore_state() should print information about the
> > errors before clearing them?
>
> While that could work, we would lose the error severity information at

Wait, we've got that saved in pci_cap_saved_state, so we could restore
the severity register, report leftover errors, then clear those errors?
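
To make the ordering concrete, here is a minimal userspace sketch of "restore severity, report leftovers, then clear". It is not a proposed patch: the mock_* structs and function are hypothetical, and the registers are modelled as plain integers rather than accessed through config space.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of a device's AER uncorrectable error registers. */
struct mock_aer_regs {
	uint32_t uncor_status;		/* one bit per logged error */
	uint32_t uncor_severity;	/* bit set = error is fatal */
};

struct mock_dev {
	struct mock_aer_regs regs;	/* live register contents */
	uint32_t saved_uncor_severity;	/* copy taken when state was saved */
};

/* Returns the number of leftover errors that were reported. */
int mock_restore_report_clear(struct mock_dev *dev)
{
	uint32_t status;
	int bit, reported = 0;

	assert(dev);

	/* 1. Restore the severity register from the saved state. */
	dev->regs.uncor_severity = dev->saved_uncor_severity;

	/* 2. Report leftover errors, now with the right severity. */
	status = dev->regs.uncor_status;
	for (bit = 0; bit < 32; bit++) {
		uint32_t mask = UINT32_C(1) << bit;

		if (!(status & mask))
			continue;
		printf("leftover uncorrectable error, bit %d (%s)\n", bit,
		       (dev->regs.uncor_severity & mask) ? "fatal" : "non-fatal");
		reported++;
	}

	/* 3. Only then clear the errors that were just reported. */
	dev->regs.uncor_status &= ~status;

	return reported;
}
```

The point of the ordering is that step 2 classifies each leftover bit against the restored severity, so nothing is lost by the time step 3 clears the status.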

Thanks,

Lukas