Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Andy Lutomirski <luto@amacapital.net>
Date: 2021-03-02 02:28:54
Also in:
lkml
On Mar 1, 2021, at 11:02 AM, Luck, Tony [off-list ref] wrote:
Some programs may use read(2), write(2), etc as ways to check if memory is valid without getting a signal. They might not want signals, which means that this feature might need to be configurable.

That sounds like an appalling hack. If users need such a mechanism we should create some better way to do that.
Appalling hack or not, it works. So, if we’re going to send a signal to user code that looks like it originated from a bona fide architectural recoverable fault, it needs to be recoverable. A load from a failed NVDIMM page is such a fault. A *kernel* load is not. So we need to distinguish it somehow.
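For anyone following along, here is a minimal sketch of the probing trick being discussed, assuming Linux and using a pipe as the sink. The names probe_readable() and probe_demo() are hypothetical and do not come from any patch in this thread. The point is that a bad user address makes write(2) fail with EFAULT instead of raising a signal in the caller, which is the behavior such programs rely on.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to copy len bytes from addr into a
 * pipe. Returns 1 if the whole range was readable, 0 if the copy faulted
 * (EFAULT or a short write), and -1 on an unrelated error.
 * Keep len small (at most one page) so the write cannot block on a full pipe.
 */
int probe_readable(const void *addr, size_t len)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    ssize_t n = write(fds[1], addr, len);
    int saved = errno;

    close(fds[0]);
    close(fds[1]);

    if (n == (ssize_t)len)
        return 1;                 /* kernel copied every byte */
    if (n >= 0 || saved == EFAULT)
        return 0;                 /* fault partway through, or at the start */
    errno = saved;
    return -1;
}

/* Hypothetical demo: a stack buffer is readable, a PROT_NONE page is not. */
int probe_demo(void)
{
    char buf[16] = "hello";
    long page = sysconf(_SC_PAGESIZE);
    void *hole = mmap(NULL, (size_t)page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (hole == MAP_FAILED)
        return -1;

    int ok = probe_readable(buf, sizeof buf) == 1 &&
             probe_readable(hole, 16) == 0;

    munmap(hole, (size_t)page);
    return ok ? 0 : 1;
}
```

Calling probe_demo() from main() should return 0 without any signal handler installed. A program written this way would presumably be surprised if the same kernel copy started delivering SIGBUS for a poisoned page, which is the concern raised above.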
An aeon ago ACPI created the RASF table as a way for the OS to ask the platform to scan a block of physical memory using the patrol scrubber in the memory controller. I never did anything with it in Linux because it was just too complex and I didn't know of any use cases.

Users would want to check virtual addresses. Perhaps some new option MADV_CHECKFORPOISON to madvise(2)?

-Tony