Re: [PATCH 2/2] ACPI: extlog: Trace CPER PCI Express Error Section
From: Dan Williams <hidden>
Date: 2024-12-11 01:52:08
Also in:
linux-acpi, linux-pci, lkml
Fabio M. De Francesco wrote:
On Tuesday, August 6, 2024 9:56:24 PM GMT+2 Dan Williams wrote:quoted
Fabio M. De Francesco wrote:quoted
Currently, extlog_print() (ELOG) only reports CPER PCIe section (UEFI v2.10, Appendix N.2.7) to the kernel log via print_extlog_rcd().I think the critical detail is is that print_extlog_rcd() is only triggered when ras_userspace_consumers() returns true. The observation is that ras_userspace_consumers() hides information from the trace path when the intended purpose of it was to hide duplicate emissions to the kernel log when userspace is watching the tracepoints. Setting aside whether ras_userspace_consumers() is still a good idea or not, it is obvious that this patch as is may surprise environments that start seeing kernel error logs where the kernel was silent before. I think the path of least surprise would be to make sure that pci_print_aer() optionally skips emitting to the kernel log when not needed wanted.Sorry for replying so late... I'm not entirely sure that users would not prefer to be surprised by _finally_ seeing kernel error logs for failing PCIe components. I suspect that users might have been confused by not seeing any output.
2 notes: * New KERN_ERR prints are often found to be unwelcome. When the kernel starts printing new error messages it causes sysadmins to scramble. * The future of RAS is trace-events. Any new RAS messages to the kernel log need to ask the question, "is userspace better served by registering for a RAS trace event, rather than parsing kernel log messsages". [..]
I need to be sure that I understood...
void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
struct aer_capability_regs *aer)
{
[...]
if (printk_get_level(level) <= console_loglevel) {
pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
status, mask);
No, the code would be:
pci_printk(level, dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
...i.e. just pass @level rather than open code "if
(printk_get_level(level) <= console_loglevel)".