Thread: 18 messages, 5 authors, 2024-01-27

RE: [PATCH v2 0/2] Update mce_record tracepoint

From: "Luck, Tony" <tony.luck@intel.com>
Date: 2024-01-26 20:49:08
Also in: linux-edac, lkml

>> Is it so very different to add this to a trace record so that rasdaemon
>> can have feature parity with mcelog(8)?
>
> I knew you were gonna say that. When someone decides that it is
> a splendid idea to add more stuff to struct mce then said someone would
> want it in the tracepoint too.
>
> And then we're back to my original question:
>
> "And where does it end? Stick full dmesg in the tracepoint too?"
>
> Where do you draw the line in the sand and say, no more, especially
> static, fields bloating the trace record should be added and from then
> on, you should go collect the info from that box. Something which you're
> supposed to do anyway.

Every patch that adds new code or data structures adds to the kernel
memory footprint. Each should be considered on its merits. The basic
question being:

   "Is the new functionality worth the cost?"

Where does it end? It would end if Linus declared:

  "Linux is now complete. Stop sending patches".

I.e. it is never going to end.

If somebody posts a patch asking to add the full dmesg to a
tracepoint, I'll stand with you to say: "Not only no, but hell no".

So for Naik's two patches we have:

1) PPIN
Cost = 8 bytes.
Benefit: Embeds a system identifier into the trace record so there
can be no ambiguity about which machine generated this error.
Also definitively indicates which socket on a multi-socket system.

2) MICROCODE
Cost = 4 bytes
Benefit: Certainty about the microcode version active on the core
at the time the error was detected.
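
To make the cost side concrete, here is a rough sketch of the per-record arithmetic. It only assumes the sizes quoted above (a 64-bit PPIN and a 32-bit microcode revision). The format strings and helper names are hypothetical and for illustration only; they are not the kernel's actual tracepoint layout.

```python
import struct

# Hypothetical layout for illustration only: the real record is defined by
# the kernel's tracepoint machinery, not by these format strings.
PPIN_FMT = "<Q"        # 64-bit unsigned: the 8-byte PPIN cost quoted above
MICROCODE_FMT = "<I"   # 32-bit unsigned: the 4-byte microcode cost quoted above
EXTRA_FMT = "<QI"      # both fields back to back, no padding


def added_bytes_per_record():
    """Return the extra payload the two fields add to each record."""
    return struct.calcsize(PPIN_FMT) + struct.calcsize(MICROCODE_FMT)


def pack_extra_fields(ppin, microcode):
    """Serialize the two extra fields the way a producer might emit them."""
    return struct.pack(EXTRA_FMT, ppin, microcode)


def unpack_extra_fields(blob):
    """Recover (ppin, microcode) as a consumer-side tool might."""
    return struct.unpack(EXTRA_FMT, blob)


if __name__ == "__main__":
    print("extra bytes per record:", added_bytes_per_record())
```

Under these assumptions the two patches together add 12 bytes to each record.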

RAS = Reliability, Availability, Serviceability

These changes fall into the serviceability bucket. They make it
easier to diagnose what went wrong.


-Tony