Re: [PATCH v6 4/5] PCI/ERR: Use pcie_aer_is_native() to check for native AER control
From: Shuai Xue <xueshuai@linux.alibaba.com>
Date: 2025-10-20 14:45:39
Also in:
linux-pci, lkml
On 2025/10/20 21:58, Lukas Wunner wrote:
> On Mon, Oct 20, 2025 at 09:09:41PM +0800, Shuai Xue wrote:
>> On 2025/10/20 18:17, Lukas Wunner wrote:
>>> On Wed, Oct 15, 2025 at 10:41:58AM +0800, Shuai Xue wrote:
>>>> Replace the manual checks for native AER control with the
>>>> pcie_aer_is_native() helper, which provides a more robust way
>>>> to determine if we have native control of AER.
>>>
>>> Why is it more robust?
>>
>> IMHO, the pcie_aer_is_native() helper is more robust because it
>> includes additional safety checks that the manual approach lacks:
> [...]
>> Specifically, it performs a sanity check for dev->aer_cap before
>> evaluating native AER control.
>
> I'm under the impression that aer_cap must be set, otherwise the
> error wouldn't have been reported and we wouldn't be in this code
> path? If we can end up in this code path without aer_cap set,
> your patch would regress devices which are not AER-capable because
> it would now skip clearing of errors in the Device Status register
> via pcie_clear_device_status().
Hi Lukas,
You raise an excellent point about the potential regression.
The original code is:

	if (host->native_aer || pcie_ports_native) {
		pcie_clear_device_status(bridge);
		pci_aer_clear_nonfatal_status(bridge);
	}

This code clears both the PCIe Device Status register and AER status
registers when in native AER mode.
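
To make the difference between the two guards concrete, here is a minimal
user-space sketch, not kernel code. struct fake_dev, fake_pcie_ports_native
and both guard functions are hypothetical stand-ins. guard_helper() follows
my understanding of pcie_aer_is_native(), namely that it returns false when
dev->aer_cap is not set; please treat that as an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the two guards look at. */
struct fake_dev {
	unsigned int aer_cap;	/* offset of the AER capability, 0 if absent */
	bool host_native_aer;	/* models host->native_aer */
};

/* Models the global pcie_ports_native override. */
static bool fake_pcie_ports_native;

/* The manual check used by the original code. */
static bool guard_manual(const struct fake_dev *dev)
{
	return dev->host_native_aer || fake_pcie_ports_native;
}

/* Assumed shape of pcie_aer_is_native(): same check plus an aer_cap test. */
static bool guard_helper(const struct fake_dev *dev)
{
	if (!dev->aer_cap)
		return false;

	return fake_pcie_ports_native || dev->host_native_aer;
}
```

The two guards only disagree for a device with aer_cap == 0 under native
control: guard_manual() still lets the Device Status register be cleared,
while guard_helper() skips it.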
pcie_clear_device_status() was renamed from
pci_aer_clear_device_status(). Is it intended to clear only the AER
error status bits?
- BIT 0: Correctable Error Detected
- BIT 1: Non-Fatal Error Detected
- BIT 2: Fatal Error Detected
- BIT 3: Unsupported Request Detected
According to the PCIe spec, BIT 0-2 are logged for functions supporting
Advanced Error Handling.
I am not sure whether we should also clear BIT 3 and BIT 6 (Emergency
Power Reduction Detected) in the case of an AER error.
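
The question can be illustrated with a small user-space model of the
Device Status register, not kernel code. It assumes that
pcie_clear_device_status() reads the register and writes the same value
back, so every RW1C bit that is set gets cleared. All DEVSTA_* names and
both clear_*() functions are local to this sketch; DEVSTA_EPRD in
particular is a hypothetical name for BIT 6.

```c
#include <stdint.h>

/* Device Status bits discussed above (names are local to this sketch). */
#define DEVSTA_CED	0x0001	/* BIT 0: Correctable Error Detected */
#define DEVSTA_NFED	0x0002	/* BIT 1: Non-Fatal Error Detected */
#define DEVSTA_FED	0x0004	/* BIT 2: Fatal Error Detected */
#define DEVSTA_URD	0x0008	/* BIT 3: Unsupported Request Detected */
#define DEVSTA_EPRD	0x0040	/* BIT 6: hypothetical name, Emergency Power Reduction Detected */

/* BIT 4 and BIT 5 are read-only, so only these bits react to a write. */
#define DEVSTA_RW1C_MASK \
	(DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED | DEVSTA_URD | DEVSTA_EPRD)
#define DEVSTA_AER_BITS	(DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED)

/* RW1C semantics: writing 1 clears a bit, writing 0 leaves it alone. */
static uint16_t rw1c_write(uint16_t reg, uint16_t val)
{
	return (uint16_t)(reg & ~(val & DEVSTA_RW1C_MASK));
}

/* Assumed current behaviour: write back whatever was read. */
static uint16_t clear_all(uint16_t reg)
{
	return rw1c_write(reg, reg);
}

/* Alternative: clear only BIT 0-2, leave BIT 3 and BIT 6 untouched. */
static uint16_t clear_aer_only(uint16_t reg)
{
	return rw1c_write(reg, reg & DEVSTA_AER_BITS);
}
```

With BIT 1, BIT 3 and BIT 6 set, clear_all() leaves nothing behind, while
clear_aer_only() keeps BIT 3 and BIT 6 set for whoever owns those events.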
> Thanks,
>
> Lukas

Thanks.
Shuai