Thread: 4 messages, 3 authors, 2019-10-25

RE: [PATCH v2] x86/hyper-v: micro-optimize send_ipi_one case

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: 2019-10-25 17:26:10
Also in: lkml

Michael Kelley [off-list ref] writes:
From: Vitaly Kuznetsov <vkuznets@redhat.com>
When sending an IPI to a single CPU there is no need to deal with cpumasks.
With 2 CPU guest on WS2019 I'm seeing a minor (like 3%, 8043 -> 7761 CPU
cycles) improvement with smp_call_function_single() loop benchmark. The
optimization, however, is tiny and straightforward. Also, send_ipi_one() is
important for PV spinlock kick.

I was also wondering if it would make sense to switch to using regular
APIC IPI send for CPU > 64 case but no, it is twice as expensive (12650 CPU
cycles for __send_ipi_mask_ex() call, 26000 for orig_apic.send_IPI(cpu,
vector)).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes since v1:
 - Style changes [Roman, Joe]
---
 arch/x86/hyperv/hv_apic.c           | 13 ++++++++++---
 arch/x86/include/asm/trace/hyperv.h | 15 +++++++++++++++
 2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index e01078e93dd3..fd17c6341737 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -194,10 +194,17 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector)

 static bool __send_ipi_one(int cpu, int vector)
 {
-	struct cpumask mask = CPU_MASK_NONE;
+	trace_hyperv_send_ipi_one(cpu, vector);

-	cpumask_set_cpu(cpu, &mask);
-	return __send_ipi_mask(&mask, vector);
+	if (!hv_hypercall_pg || (vector < HV_IPI_LOW_VECTOR) ||
+	    (vector > HV_IPI_HIGH_VECTOR))
+		return false;
+
+	if (cpu >= 64)
+		return __send_ipi_mask_ex(cpumask_of(cpu), vector);
The above test should be checking the VP number, not the CPU
number, since the VP number is used to form the bitmap argument
to the hypercall.  In all current implementations of Hyper-V, the CPU number
and VP number are the same as far as I am aware, but that's not guaranteed in
the future.

Michael

Oops, of course, thanks for catching this! v3 is coming!
+
+	return !hv_do_fast_hypercall16(HVCALL_SEND_IPI, vector,
+			       BIT_ULL(hv_cpu_number_to_vp_number(cpu)));
 }
-- 
Vitaly
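
For illustration only, here is a user-space sketch of the check Michael
describes: translate the CPU number to a VP number first, then decide
between the fast hypercall and the _ex variant based on the VP number.
This is not the v3 patch. The vp_index[] table contents, NR_CPUS_DEMO, the
VP_INVAL_DEMO handling and the pick_ipi_path() helper are all hypothetical
stand-ins; in the kernel the translation is done by
hv_cpu_number_to_vp_number().

```c
#include <stdint.h>

#define NR_CPUS_DEMO 4
#define VP_INVAL_DEMO UINT32_MAX	/* hypothetical "no VP assigned" marker */

/* Hypothetical CPU -> VP mapping; CPU 2 deliberately maps to a VP >= 64. */
static const uint32_t vp_index[NR_CPUS_DEMO] = { 0, 1, 70, VP_INVAL_DEMO };

enum ipi_path { IPI_FAIL, IPI_FAST, IPI_EX };

/*
 * Decide which send path a single-CPU IPI would take. On IPI_FAST,
 * *bitmap receives the 64-bit VP mask that would be passed to the
 * fast hypercall.
 */
static enum ipi_path pick_ipi_path(int cpu, uint64_t *bitmap)
{
	uint32_t vp;

	if (cpu < 0 || cpu >= NR_CPUS_DEMO)
		return IPI_FAIL;

	vp = vp_index[cpu];
	if (vp == VP_INVAL_DEMO)
		return IPI_FAIL;

	/* The 64-bit mask is indexed by VP number, so test vp, not cpu. */
	if (vp >= 64)
		return IPI_EX;

	*bitmap = UINT64_C(1) << vp;
	return IPI_FAST;
}
```

With this table, CPU 1 takes the fast path with bitmap 0x2, while CPU 2
(VP 70) has to fall back to the _ex path even though its CPU number is
below 64, which is exactly the case a cpu >= 64 test would get wrong.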