Re: [PATCH v2 06/16] KVM: PPC: Book3S HV: XIVE: add controls for the EQ configuration
From: Cédric Le Goater <clg@kaod.org>
Date: 2019-03-14 07:13:02
Also in:
kvm
On 3/14/19 3:32 AM, David Gibson wrote:
On Wed, Mar 13, 2019 at 10:40:19AM +0100, Cédric Le Goater wrote:quoted
On 2/26/19 6:24 AM, Paul Mackerras wrote:quoted
On Fri, Feb 22, 2019 at 12:28:30PM +0100, Cédric Le Goater wrote:quoted
These controls will be used by the H_INT_SET_QUEUE_CONFIG and H_INT_GET_QUEUE_CONFIG hcalls from QEMU. They will also be used to restore the configuration of the XIVE EQs in the KVM device and to capture the internal runtime state of the EQs. Both 'get' and 'set' rely on an OPAL call to access from the XIVE interrupt controller the EQ toggle bit and EQ index which are updated by the HW when event notifications are enqueued in the EQ. The value of the guest physical address of the event queue is saved in the XIVE internal xive_q structure for later use. That is when migration needs to mark the EQ pages dirty to capture a consistent memory state of the VM. To be noted that H_INT_SET_QUEUE_CONFIG does not require the extra OPAL call setting the EQ toggle bit and EQ index to configure the EQ, but restoring the EQ state will.[snip]quoted
+/* Layout of 64-bit eq attribute */ +#define KVM_XIVE_EQ_PRIORITY_SHIFT 0 +#define KVM_XIVE_EQ_PRIORITY_MASK 0x7 +#define KVM_XIVE_EQ_SERVER_SHIFT 3 +#define KVM_XIVE_EQ_SERVER_MASK 0xfffffff8ULL + +/* Layout of 64-bit eq attribute values */ +struct kvm_ppc_xive_eq { + __u32 flags; + __u32 qsize; + __u64 qpage; + __u32 qtoggle; + __u32 qindex; + __u8 pad[40]; +};This is confusing. What's the difference between an "eq attribute" and an "eq attribute value"? Is the first actually a queue index or a queue identifier?The "attribute" qualifier comes from the {get,set,has}_addr methods of the KVM device. But it is not a well chosen name for the group KVM_DEV_XIVE_GRP_EQ_CONFIG. I should be using "eq identifier" and "eq values" or "eq state".Yeah, that seems clearer.quoted
quoted
Also, the kvm_ppc_xive_eq is not 64 bits, so the comment above it is wrong. Maybe you meant "64-byte"?That was a bad copy paste. I have padded the structure to twice the size of the XIVE END (the XIVE EQ descriptor in HW) which size is 32 bytes. I thought that one extra u64 was not enough room for future.quoted
[snip]quoted
+ page = gfn_to_page(kvm, gpa_to_gfn(kvm_eq.qpage)); + if (is_error_page(page)) { + pr_warn("Couldn't get guest page for %llx!\n", kvm_eq.qpage); + return -ENOMEM; + } + qaddr = page_to_virt(page) + (kvm_eq.qpage & ~PAGE_MASK);Isn't this assuming that we can map the whole queue with a single gfn_to_page? That would only be true if kvm_eq.qsize <= PAGE_SHIFT. What happens if kvm_eq.qsize > PAGE_SHIFT?Ah yes. Theoretically, it should not happen because we only advertise 64K in the DT for the moment. I should at least add a check. So I will change the helper xive_native_validate_queue_size() to return -EINVAL for other page sizes.Ok.quoted
Do you think it would be complex to support XIVE EQs using a page larger than the default one on the guest ?Hm. The queue has to be physically contiguous from the host point of view, in order for the XIVE hardware to write to it, doesn't it? If so then supporting queues bigger than the guest page size would be very difficult.
The queue is only *one* page. C.