Thread (5 messages) 5 messages, 4 authors, 2018-12-17

Re: [RFC RESEND PATCH] kvm: arm64: export memory error recovery capability to user space

From: gengdongjiu <hidden>
Date: 2018-12-14 22:31:15
Also in: kvm, linux-doc, lkml

HI James,

      Thanks for the mail and comments, I will reply to you in the next mail.

2018-12-14 21:55 GMT+08:00, James Morse [off-list ref]:
Hi Dongjiu Geng,

On 14/12/2018 10:15, Dongjiu Geng wrote:
quoted
When user space do memory recovery, it will check whether KVM and
guest support the error recovery, only when both of them support,
user space will do the error recovery. This patch exports this
capability of KVM to user space.
I can understand user-space only wanting to do the work if host and guest
support the feature. But 'error recovery' isn't a KVM feature, its a Linux
kernel feature.

KVM will send it's user-space a SIGBUS with MCEERR code whenever its trying
to
map a page at stage2 that the kernel-mm code refuses this because its
poisoned.
(e.g. check_user_page_hwpoison(), get_user_pages() returns -EHWPOISON)

This is exactly the same as happens to a normal user-space process.

I think you really want to know if the host kernel was built with
CONFIG_MEMORY_FAILURE. The not-at-all-portable way to tell this from
user-space
is the presence of /proc/sys/vm/memory_failure_* files.
(It looks like the prctl():PR_MCE_KILL/PR_MCE_KILL_GET options silently
update
an ignored policy if the kernel isn't built with CONFIG_MEMORY_FAILURE, so
they
aren't helpful)

quoted
diff --git a/Documentation/virtual/kvm/api.txt
b/Documentation/virtual/kvm/api.txt
index cd209f7..241e2e2 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4895,3 +4895,12 @@ Architectures: x86
 This capability indicates that KVM supports paravirtualized Hyper-V IPI
send
 hypercalls:
 HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
+
+8.21 KVM_CAP_ARM_MEMORY_ERROR_RECOVERY
+
+Architectures: arm, arm64
+
+This capability indicates that guest memory error can be detected by the
KVM which
+supports the error recovery.
KVM doesn't detect these errors.
The hardware detects them and notifies the OS via one of a number of
mechanisms.
This gets plumbed into memory_failure(), which sets a flag that the mm code
uses
to prevent the page being used again.

KVM is only involved when it tries to map a page at stage2 and the mm code
rejects it with -EHWPOISON. This is the same as the architectures
do_page_fault() checking for (fault & VM_FAULT_HWPOISON) out of
handle_mm_fault(). We don't have a KVM cap for this, nor do we need one.

quoted
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b72a3dd..90d1d9a 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -82,6 +82,7 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm,
long ext)
 		r = kvm_arm_support_pmu_v3();
 		break;
 	case KVM_CAP_ARM_INJECT_SERROR_ESR:
+	case KVM_CAP_ARM_MEMORY_ERROR_RECOVERY:
 		r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
 		break;
The CPU RAS Extensions are not at all relevant here. It is perfectly
possible to
support memory-failure without them, AMD-Seattle and APM-X-Gene do this.
These
systems would report not-supported here, but the kernel does support this
stuff.
Just because the CPU supports this, doesn't mean the kernel was built with
CONFIG_MEMORY_FAILURE. The CPU reports may be ignored, or upgraded to
SIGKILL.



Thanks,

James
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help