Thread: 33 messages, 6 authors, 2021-09-27

Re: [PATCHv3 2/2] drm/amdgpu: Register MCE notifier for Aldebaran RAS

From: Yazen Ghannam <yazen.ghannam@amd.com>
Date: 2021-09-24 19:46:34
Also in: amd-gfx, lkml

On Thu, Sep 23, 2021 at 08:14:15PM +0200, Borislav Petkov wrote:
> On Thu, Sep 23, 2021 at 05:23:21PM +0000, Yazen Ghannam wrote:
> > Shouldn't the error still be reported to EDAC for decoding and counting? I
> > think users want this.
>
> You know what happens with users getting ECCs reported, right? They
> think immediately their hw is going bad and start wanting to replace
> it...
>
> So what does actually tell you if you were a simple user and you had 5
> correctable errors in the GPU VRAM?

I agree with you in general. But this device isn't really a GPU. And users of
this device seem to want to count *every* error, at least for now.

> All you wanna do is play, I'd say.
>
> :-)

Definitely. :)

Thanks,
Yazen