Re: [PATCH v6 4/7] arm64: kvm: support user space to query RAS extension feature
From: gengdongjiu <hidden>
Date: 2017-09-08 14:34:27
Also in: kvm, kvmarm, linux-acpi
Hi James,

Thanks a lot for your detailed comments. I am CCing Peter, who is a Qemu expert. Let us see what he suggests.
> Hi gengdongjiu,
>
> On 05/09/17 08:18, gengdongjiu wrote:
>> On 2017/9/1 2:04, James Morse wrote:
>>> On 28/08/17 11:38, Dongjiu Geng wrote:
>>>> Userspace will want to check if the CPU has the RAS extension.
>>>
>>> ... but user-space wants to know if it can inject SErrors with a
>>> specified ESR. What if we gain another way of doing this that isn't
>>> via the RAS-extensions? Now user-space has to check for two
>>> capabilities.
>>>
>>>> If it has, it will specify the virtual SError syndrome value,
>>>> otherwise it will not be set. This patch adds support for querying
>>>> the availability of this extension.
>>>
>>> I'm against telling user-space what features the CPU has unless it
>>> can use them directly. In this case we are talking about a KVM API,
>>> so we should describe the API, not the CPU.
>>
>> shenglong (zhaoshenglong at huawei.com), who is a Qemu maintainer,
>> suggested checking the CPU RAS-extensions to decide whether to
>> generate the APEI table and record CPER for the guest OS in user
>> space. He means that if the host does not support RAS, user space
>> may also not support RAS.
>
> The code to signal memory-failure to user-space doesn't depend on the
> CPU's RAS-extensions.
>
> If Qemu supports notifying the guest about RAS errors using CPER
> records, it should generate a HEST describing firmware-first. It can
> then choose the notification methods, some of which may require
> optional KVM APIs to support.
>
> Seattle has a HEST, but it doesn't support the CPU RAS-extensions. The
> kernel can notify user-space about memory_failure() on this machine.
> I would expect Qemu to be able to receive signals and describe memory
> errors to a guest (1).
>
> The question should be: 'How can Qemu know it can use SEI as a
> firmware-first notification?' It needs a KVM API to trigger an SError
> in the guest with a specified ESR. The name of the KVM CAP needs to
> reflect the API (2). Just because this is the first KVM API that needs
> the CPU to have the RAS extensions doesn't mean we should call it
> 'has RAS' and be done with it.
>
> We will eventually need another KVM API to configure trapping and
> emulating values in the RAS ERR registers so that Qemu can emulate a
> machine without firmware-first. (This is likely to be a page of memory
> that backs the registers; there will need to be another KVM CAP to
> describe this support (3).)
>
> Exposing the CPU's support for RAS-extensions to support (2) means
> having per-platform support for (1). This is either creating extra
> work, or not supporting as many platforms as we could. Both are bad.
>
> Once we have (3) as well, any developer needs to know that 'has RAS'
> just meant the first API KVM implemented using RAS, and doesn't mean
> later APIs also using RAS are supported by the kernel.

Hi Peter/shenglong,

What do you think about this? We may need to consult with you on it.

> Thanks,
> James
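
To make the (1)/(2)/(3) distinction above concrete, here is a minimal sketch in plain C of how user space could choose its RAS features if each capability described one KVM API instead of one CPU feature. Everything named demo_* or DEMO_* is hypothetical and invented for illustration; none of it comes from the patch or from <linux/kvm.h>. A real VMM would presumably ask the kernel through the KVM_CHECK_EXTENSION ioctl rather than read a local table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability identifiers, one per KVM API (not per CPU feature). */
enum demo_cap {
    DEMO_CAP_INJECT_SERROR_ESR, /* (2) trigger an SError in the guest with a chosen ESR */
    DEMO_CAP_EMULATE_ERR_REGS,  /* (3) trap and emulate the RAS ERR registers */
    DEMO_CAP_COUNT
};

/* Stand-in for what the host kernel reports for each capability. */
struct demo_host {
    bool caps[DEMO_CAP_COUNT];
};

/* What the VMM decides to offer to the guest. */
struct demo_guest_ras {
    bool hest_firmware_first; /* (1) HEST + CPER records for memory errors */
    bool notify_sei;          /* SEI used as a firmware-first notification */
    bool emulate_err_regs;    /* guest sees emulated RAS ERR registers */
};

/* Stand-in for a capability query; assumed to map onto one KVM API each. */
static bool demo_check_cap(const struct demo_host *host, enum demo_cap cap)
{
    assert(cap < DEMO_CAP_COUNT);
    return host->caps[cap];
}

static struct demo_guest_ras demo_plan(const struct demo_host *host)
{
    struct demo_guest_ras plan;

    /* (1) does not depend on the CPU's RAS-extensions, so it is always on. */
    plan.hest_firmware_first = true;
    /* (2) and (3) are independent APIs and are checked independently. */
    plan.notify_sei = demo_check_cap(host, DEMO_CAP_INJECT_SERROR_ESR);
    plan.emulate_err_regs = demo_check_cap(host, DEMO_CAP_EMULATE_ERR_REGS);
    return plan;
}
```

With this shape, a Seattle-like host that reports neither capability still gets (1), and a kernel that implements (2) but not yet (3) is described accurately. A single 'has RAS' flag could not express either case.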