Re: [PATCHv3 2/2] drm/amdgpu: Register MCE notifier for Aldebaran RAS
From: Yazen Ghannam <yazen.ghannam@amd.com>
Date: 2021-09-24 19:46:34
Also in:
amd-gfx, lkml
On Thu, Sep 23, 2021 at 08:14:15PM +0200, Borislav Petkov wrote:
> On Thu, Sep 23, 2021 at 05:23:21PM +0000, Yazen Ghannam wrote:
> > Shouldn't the error still be reported to EDAC for decoding and counting? I think users want this.
>
> You know what happens with users getting ECCs reported, right? They think immediately their hw is going bad and start wanting to replace it... So what does actually tell you if you were a simple user and you had 5 correctable errors in the GPU VRAM?

I agree with you in general. But this device isn't really a GPU. And users of this device seem to want to count *every* error, at least for now.

> All you wanna do is play, I'd say. :-)

Definitely. :)

Thanks,
Yazen