Re: [PATCH v6 3/5] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd
From: Lukas Wunner <lukas@wunner.de>
Date: 2025-10-20 14:24:54
Also in:
linux-pci, lkml
On Mon, Oct 20, 2025 at 10:17:10PM +0800, Shuai Xue wrote:
> void aer_report_frozen_error(struct pci_dev *dev)
> {
> 	struct aer_err_info info;
>
> 	if (dev->pci_type != PCI_EXP_TYPE_ENDPOINT &&
> 	    dev->pci_type != PCI_EXP_TYPE_RC_END)
> 		return;
>
> 	aer_info_init(&info);
> 	aer_add_error_device(&info, dev);
> 	info.severity = AER_FATAL;
> 	if (aer_get_device_error_info(&info, 0, true))
> 		aer_print_error(&info, 0);
>
> 	/* pci_dev_put() pairs with pci_dev_get() in aer_add_error_device() */
> 	pci_dev_put(dev);
> }

Much better.  Again, I think you don't need to rename add_error_device()
and then the code comment even fits on the same line:

	pci_dev_put(dev);  /* pairs with pci_dev_get() in add_error_device() */

> > > .slot_reset() => pci_restore_state() => pci_aer_clear_status()
> >
> > This was added in 2015 by b07461a8e45b.  The commit claims that
> > the errors are stale and can be ignored.  It turns out they cannot.
> >
> > So maybe pci_restore_state() should print information about the
> > errors before clearing them?
>
> While that could work, we would lose the error severity information at

Wait, we've got that saved in pci_cap_saved_state, so we could restore
the severity register, report leftover errors, then clear those errors?

Thanks,

Lukas
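
To illustrate the ordering proposed above (restore the severity register,
report leftover errors, then clear them), here is a minimal user-space
sketch.  It is not kernel code and not a proposed patch: the structs
toy_aer_regs and toy_saved_state and the helper toy_restore_report_clear()
are hypothetical stand-ins for the AER Uncorrectable Error Status/Severity
registers and the saved capability state, and the register write-1-to-clear
behavior is only modeled with a plain bit mask.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model of the two AER registers involved (hypothetical, not kernel types). */
struct toy_aer_regs {
	uint32_t uncor_status;		/* latched uncorrectable error bits */
	uint32_t uncor_severity;	/* 1 = fatal, 0 = non-fatal, per error bit */
};

/* Toy stand-in for the severity value captured before the reset. */
struct toy_saved_state {
	uint32_t uncor_severity;
};

/*
 * Hypothetical helper: restore severity, report leftover errors, clear them.
 * Returns the status bits that were reported; *fatal_bits receives the subset
 * that the restored severity register classifies as fatal.
 */
static uint32_t toy_restore_report_clear(struct toy_aer_regs *regs,
					 const struct toy_saved_state *saved,
					 uint32_t *fatal_bits)
{
	uint32_t status;

	assert(regs != NULL && saved != NULL && fatal_bits != NULL);

	/* 1. Restore severity first, otherwise the errors cannot be classified. */
	regs->uncor_severity = saved->uncor_severity;

	/* 2. Report whatever is still latched in the status register. */
	status = regs->uncor_status;
	*fatal_bits = status & regs->uncor_severity;
	if (status)
		printf("leftover errors: status=%#010x fatal=%#010x nonfatal=%#010x\n",
		       (unsigned int)status, (unsigned int)*fatal_bits,
		       (unsigned int)(status & ~regs->uncor_severity));

	/* 3. Only now clear the bits that were just reported. */
	regs->uncor_status &= ~status;

	return status;
}
```

Called with a status of 0x4020 and a saved severity of 0x4000, the helper
reports both bits, classifies 0x4000 as fatal, and leaves the status
register empty.  The point of the ordering is that clearing happens last,
so nothing is dropped before it has been classified and printed.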