Thread (83 messages) 83 messages, 7 authors, 2016-01-11
STALE3795d

[PATCH v8 20/20] KVM: ARM64: Add a new kvm ARM PMU device

From: Andrew Jones <hidden>
Date: 2016-01-08 11:22:13
Also in: kvm, kvmarm

On Fri, Jan 08, 2016 at 10:53:03AM +0800, Shannon Zhao wrote:

On 2016/1/8 4:18, Andrew Jones wrote:
quoted
On Tue, Dec 22, 2015 at 04:08:15PM +0800, Shannon Zhao wrote:
quoted
From: Shannon Zhao <redacted>

Add a new kvm device type KVM_DEV_TYPE_ARM_PMU_V3 for ARM PMU. Implement
the kvm_device_ops for it.

Signed-off-by: Shannon Zhao <redacted>
---
 Documentation/virtual/kvm/devices/arm-pmu.txt |  24 +++++
 arch/arm64/include/uapi/asm/kvm.h             |   4 +
 include/linux/kvm_host.h                      |   1 +
 include/uapi/linux/kvm.h                      |   2 +
 virt/kvm/arm/pmu.c                            | 128 ++++++++++++++++++++++++++
 virt/kvm/kvm_main.c                           |   4 +
 6 files changed, 163 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/arm-pmu.txt
diff --git a/Documentation/virtual/kvm/devices/arm-pmu.txt b/Documentation/virtual/kvm/devices/arm-pmu.txt
new file mode 100644
index 0000000..dda864e
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/arm-pmu.txt
@@ -0,0 +1,24 @@
+ARM Virtual Performance Monitor Unit (vPMU)
+===========================================
+
+Device types supported:
+  KVM_DEV_TYPE_ARM_PMU_V3         ARM Performance Monitor Unit v3
+
+Instantiate one PMU instance for per VCPU through this API.
                                ^^ don't need this 'for' here
quoted
+
+Groups:
+  KVM_DEV_ARM_PMU_GRP_IRQ
+  Attributes:
+    The attr field of kvm_device_attr encodes one value:
+    bits:     | 63 .... 32 | 31 ....  0 |
+    values:   |  reserved  | vcpu_index |
Everywhere else in kvm documentation a vcpu_index is 8 bits. I'm not
saying that that's good, but expanding it to 32 bits here is
inconsistent. 
Expand it to 32 bits just in case it needs to support more than 256
vcpus in the future. If you think this is not good, I'll change it to 8
bits.
I agree that eventually we'll want more than 256 vcpus. However the kvm
api already has the concept of a vcpu-index, which shows up in two other
places, KVM_IRQ_LINE and KVM_DEV_ARM_VGIC_*. In both those other places
it's only 8 bits. While there may not be a KVM_VCPU_INDEX_MASK=0xff
somewhere, I still think we should keep things consistent.

Hmm, or each device should define its own index. Actually, that sounds
better, as each device may have a different max-vcpus limit; KVM_IRQ_LINE
has 8 bits due to the x86 apic supporting 8 bits, gicv2 could have gotten
by with only 3 bits, and gicv3 can use much, much more. When we want more
than 256 vcpus we'll need a new kvm-irq-line ioctl, but, as for the arm
devices, we could avoid the problem completely by extending the
SET/GET_DEVICE_ATTR ioctl to also be a vcpu ioctl.
(A side note is that I think we should s/vcpu_index/vcpu_id/
quoted
in order to keep the parameter name consistent with what's used by
KVM_CREATE_VCPU)
Eh, I make it consistent with vGIC which is a kvm device that's also
created via KVM_CREATE_DEVICE API. So they are different names but
express same thing.
I knew you are consistent with arm-vgic.txt, and also with KVM_IRQ_LINE.
I was just thinking that it'd be nice for everything to be consistent
with KVM_CREATE_VCPU. However, now I'm of the mind that the vcpu-id
(32 bits) should always be independent of device vcpu-indexes, as we
don't want to limit the number of vcpus (by limiting vcpu-ids) to the
maximum of the most-limited device we support.
quoted
quoted
+    A value describing the PMU overflow interrupt number for the specified
+    vcpu_index vcpu. This interrupt could be a PPI or SPI, but for one VM the
+    interrupt type must be same for each vcpu. As a PPI, the interrupt number is
+    same for all vcpus, while as a SPI it must be different for each vcpu.
+
+  Errors:
+    -ENXIO: Unsupported attribute group
Do we need to specify ENXIO here? It's already specified in
Documentation/virtual/kvm/api.txt for SET/GET_DEVICE_ATTR
But specifying it here is not bad, right? If you insist on this, I'll
drop it.
It's fine, as the specification in SET/GET_DEVICE_ATTR section isn't
going to change, but we require users of this api to read that section
already, so I feel it's redundant and prefer to avoid redundancy. I won't
insist.
quoted
quoted
+    -EBUSY: The PMU overflow interrupt is already set
+    -ENODEV: Getting the PMU overflow interrupt number while it's not set
+    -EINVAL: Invalid vcpu_index or PMU overflow interrupt number supplied
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 2d4ca4b..cbb9022 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -204,6 +204,10 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_CTRL	4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
 
+/* Device Control API: ARM PMU */
+#define KVM_DEV_ARM_PMU_GRP_IRQ		0
+#define   KVM_DEV_ARM_PMU_CPUID_MASK	0xffffffffULL
I don't think we should need a mask like this, at least not in the uapi.
The documentation says to use a vcpu-index [really a vcpu-id], and users
should know how to convert their vcpu handle (whatever that may be) to
a vcpu-id using whatever transformation is necessary. They may already
have a "vcpu id mask" defined (this is one reason why I don't think we
should use a 32-bit vcpu-index here, instead of the 8-bit vcpu-index used
everywhere else). Likewise, kvm should know how to do it's transformation.
Maybe, to aid in the attr field construction, we should supply a shift,
allowing both sides to do something like

int pmu_attr_extract_vcpu_id(u64 attr)
{
    return (attr >> KVM_DEV_ARM_PMU_VCPUID_SHIFT) & VCPU_ID_MASK;
So what's the value of the VCPU_ID_MASK? I didn't see any definitions
about it. It looks like KVM_DEV_ARM_PMU_CPUID_MASK(just has a difference
between 32 bits and 8 bits)
Whether or not a user would define a VCPU_ID_MASK == 0xff depends on
whether or not they understand 'vcpu-index' to be a consistent concept
everywhere they see it in the documentation. I now would prefer
vcpu-index being device specific. Going that way, then I think the
documentation could be updated to help make that clearer, e.g. calling
the gic vcpu-index gic_vcpu_index or something.
quoted
}

u64 pmu_attr_create(int vcpu_id)
{
    return vcpu_id << KVM_DEV_ARM_PMU_VCPUID_SHIFT;
}

But, in this case the shift is zero, so it's not really necessary. In
any case, please add the 'V' for VCPU.
quoted
+
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
 #define KVM_ARM_IRQ_TYPE_MASK		0xff
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c923350..608dea6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1161,6 +1161,7 @@ extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
+extern struct kvm_device_ops kvm_arm_pmu_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 03f3618..4ba6fdd 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1032,6 +1032,8 @@ enum kvm_device_type {
 #define KVM_DEV_TYPE_FLIC		KVM_DEV_TYPE_FLIC
 	KVM_DEV_TYPE_ARM_VGIC_V3,
 #define KVM_DEV_TYPE_ARM_VGIC_V3	KVM_DEV_TYPE_ARM_VGIC_V3
+	KVM_DEV_TYPE_ARM_PMU_V3,
+#define	KVM_DEV_TYPE_ARM_PMU_V3		KVM_DEV_TYPE_ARM_PMU_V3
 	KVM_DEV_TYPE_MAX,
 };
 
diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 3ec3cdd..5518308 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -19,6 +19,7 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <linux/perf_event.h>
+#include <linux/uaccess.h>
 #include <asm/kvm_emulate.h>
 #include <kvm/arm_pmu.h>
 #include <kvm/arm_vgic.h>
@@ -374,3 +375,130 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 	pmc->perf_event = event;
 }
+
+static inline bool kvm_arm_pmu_initialized(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.pmu.irq_num != -1;
+}
+
+static int kvm_arm_pmu_irq_access(struct kvm *kvm, struct kvm_device_attr *attr,
+				  int *irq, bool is_set)
+{
+	int cpuid;
please call this vcpu_id
quoted
+	struct kvm_vcpu *vcpu;
+	struct kvm_pmu *pmu;
+
+	cpuid = attr->attr & KVM_DEV_ARM_PMU_CPUID_MASK;
+	if (cpuid >= atomic_read(&kvm->online_vcpus))
+		return -EINVAL;
+
+	vcpu = kvm_get_vcpu(kvm, cpuid);
+	if (!vcpu)
+		return -EINVAL;
+
+	pmu = &vcpu->arch.pmu;
+	if (!is_set) {
+		if (!kvm_arm_pmu_initialized(vcpu))
+			return -ENODEV;
+
+		*irq = pmu->irq_num;
+	} else {
+		if (kvm_arm_pmu_initialized(vcpu))
+			return -EBUSY;
+
+		kvm_debug("Set kvm ARM PMU irq: %d\n", *irq);
+		pmu->irq_num = *irq;
+	}
+
+	return 0;
+}
+
+static int kvm_arm_pmu_create(struct kvm_device *dev, u32 type)
+{
+	int i;
+	struct kvm_vcpu *vcpu;
+	struct kvm *kvm = dev->kvm;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		struct kvm_pmu *pmu = &vcpu->arch.pmu;
+
+		memset(pmu, 0, sizeof(*pmu));
I don't think we want this memset. If we can only create the pmu
once, then it's unnecessary (we zalloc vcpus). And, if we can
recreate pmus with this call, then it'll create a memory leak, as
we'll be zero-ing out all the perf_event pointers, and then won't
be able to free them on the call to kvm_pmu_vcpu_reset. Naturally
we need to make sure we're NULL-ing them after each free instead.
Ok, will drop this.
quoted
quoted
+		kvm_pmu_vcpu_reset(vcpu);
+		pmu->irq_num = -1;
+	}
+
+	return 0;
+}
+
+static void kvm_arm_pmu_destroy(struct kvm_device *dev)
+{
+	kfree(dev);
+}
+
+static int kvm_arm_pmu_set_attr(struct kvm_device *dev,
+				struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_ARM_PMU_GRP_IRQ: {
+		int __user *uaddr = (int __user *)(long)attr->addr;
+		int reg;
+
+		if (get_user(reg, uaddr))
+			return -EFAULT;
+
+		/*
+		 * The PMU overflow interrupt could be a PPI or SPI, but for one
+		 * VM the interrupt type must be same for each vcpu. As a PPI,
+		 * the interrupt number is same for all vcpus, while as a SPI it
+		 * must be different for each vcpu.
+		 */
+		if (reg < VGIC_NR_SGIS || reg >= dev->kvm->arch.vgic.nr_irqs)
+			return -EINVAL;
+
+		return kvm_arm_pmu_irq_access(dev->kvm, attr, &reg, true);
+	}
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_arm_pmu_get_attr(struct kvm_device *dev,
+				struct kvm_device_attr *attr)
+{
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_PMU_GRP_IRQ: {
+		int __user *uaddr = (int __user *)(long)attr->addr;
+		int reg = -1;
+
+
+		ret = kvm_arm_pmu_irq_access(dev->kvm, attr, &reg, false);
+		if (ret)
+			return ret;
+		return put_user(reg, uaddr);
+	}
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_arm_pmu_has_attr(struct kvm_device *dev,
+				struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_ARM_PMU_GRP_IRQ:
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+struct kvm_device_ops kvm_arm_pmu_ops = {
+	.name = "kvm-arm-pmu",
+	.create = kvm_arm_pmu_create,
+	.destroy = kvm_arm_pmu_destroy,
+	.set_attr = kvm_arm_pmu_set_attr,
+	.get_attr = kvm_arm_pmu_get_attr,
+	.has_attr = kvm_arm_pmu_has_attr,
+};
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 484079e..81a42cc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2647,6 +2647,10 @@ static struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_XICS
 	[KVM_DEV_TYPE_XICS]		= &kvm_xics_ops,
 #endif
+
+#ifdef CONFIG_KVM_ARM_PMU
+	[KVM_DEV_TYPE_ARM_PMU_V3]	= &kvm_arm_pmu_ops,
Shouldn't we specify 'v3' in the kvm_arm_pmu_ops name, as we do with the
device type name?
Sure, will add.
quoted
quoted
+#endif
 };
 
 int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
-- 
2.0.4
Thanks,
drew
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help