Thread: 12 messages, 3 authors, 2021-07-21

Re: [PATCH v8 3/5] arm64: perf: Enable PMU counter userspace access for perf event

From: Mark Rutland <mark.rutland@arm.com>
Date: 2021-06-01 13:55:44
Also in: lkml

On Mon, May 17, 2021 at 02:54:03PM -0500, Rob Herring wrote:
Arm PMUs can support direct userspace access to counters, which allows
low-overhead (i.e. no syscall) self-monitoring of tasks. The same feature
exists on x86, where it is called 'rdpmc'. Unlike x86, userspace access will
only be enabled for thread-bound events. This could be extended if needed, but
the restriction simplifies the implementation and reduces the chance of
information leaks (which the x86 implementation suffers from).

When an event is capable of userspace access and has been mmapped, userspace
access is enabled when the event is scheduled on a CPU's PMU. This adds some
overhead, because dirty counters that are not assigned are cleared in order to
prevent leaking stale counter data from other tasks.

Unlike x86, enabling of userspace access must be requested with a new
attr bit: config1:1. If the user requests userspace access and 64-bit
counters, then chaining will be disabled and the user will get the
maximum size counter the underlying h/w can support. The modes for
config1 are as follows:

config1 = 0 : user access disabled and always 32-bit
config1 = 1 : user access disabled and always 64-bit (using chaining if needed)
config1 = 2 : user access enabled and always 32-bit
config1 = 3 : user access enabled and counter size matches underlying counter.
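
The four modes above read as two independent bits. A minimal sketch of that
decoding follows, assuming bit 0 requests a 64-bit counter and bit 1 requests
userspace access, as the table implies. The macro, struct and function names
are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the patch's own macros are not shown in this excerpt. */
#define CFG1_LONG_COUNTER	(UINT64_C(1) << 0)	/* config1:0, 64-bit counter */
#define CFG1_USER_ACCESS	(UINT64_C(1) << 1)	/* config1:1, userspace access */

struct counter_mode {
	bool user_access;	/* counter may be read directly from userspace */
	bool want_64bit;	/* 64-bit counting was requested */
	bool may_chain;		/* two 32-bit counters may be chained */
};

/* Split a config1 value into the properties listed in the table. */
static struct counter_mode decode_config1(uint64_t config1)
{
	struct counter_mode m;

	m.user_access = (config1 & CFG1_USER_ACCESS) != 0;
	m.want_64bit = (config1 & CFG1_LONG_COUNTER) != 0;
	/* Chaining is only an option while userspace access is off. */
	m.may_chain = m.want_64bit && !m.user_access;
	return m;
}
```

For config1 = 3 the sketch reports userspace access with chaining disabled,
which matches the "counter size matches underlying counter" row.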

Based on work by Raphael Gault [off-list ref], but has been
completely re-written.

Signed-off-by: Rob Herring <robh@kernel.org>
[...]
+static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
+{
+	struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+
+	if (!bitmap_empty(cpuc->dirty_mask, ARMPMU_MAX_HWEVENTS)) {
+		int i;
+		/* Don't need to clear assigned counters. */
+		bitmap_xor(cpuc->dirty_mask, cpuc->dirty_mask, cpuc->used_mask, ARMPMU_MAX_HWEVENTS);
+
+		for_each_set_bit(i, cpuc->dirty_mask, ARMPMU_MAX_HWEVENTS) {
+			if (i == ARMV8_IDX_CYCLE_COUNTER)
+				write_sysreg(0, pmccntr_el0);
+			else
+				armv8pmu_write_evcntr(i, 0);
+		}
+		bitmap_zero(cpuc->dirty_mask, ARMPMU_MAX_HWEVENTS);
+	}
+
+	write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
+}

This still leaks the values of CPU-bound events, or task-bound events
owned by others, right?
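
To make the concern concrete, here is a toy userspace model, not kernel code.
It assumes that setting ER and CR in PMUSERENR_EL0 makes every counter
readable from EL0, and that assigned counters are left untouched as the
patch comment says. The model clears `dirty & ~used`, which is the intent
stated in that comment rather than a transcription of the bitmap_xor() call.
All names (`pmu_model`, `clear_stale`, `leaked`, `own`) are hypothetical.

```c
#include <stdint.h>

#define NUM_COUNTERS 8	/* hypothetical size, chosen for the model only */

struct pmu_model {
	uint64_t value[NUM_COUNTERS];	/* current counter contents */
	uint32_t used;			/* counters assigned to some event */
	uint32_t dirty;			/* counters that may hold stale data */
};

/* Zero dirty counters that are not assigned, then reset the dirty mask. */
static void clear_stale(struct pmu_model *p)
{
	uint32_t to_clear = p->dirty & ~p->used;

	for (int i = 0; i < NUM_COUNTERS; i++)
		if (to_clear & (UINT32_C(1) << i))
			p->value[i] = 0;
	p->dirty = 0;
}

/*
 * Counters with a non-zero value that do not belong to the reading task
 * ("own" is a hypothetical mask of that task's user-readable counters).
 */
static uint32_t leaked(const struct pmu_model *p, uint32_t own)
{
	uint32_t mask = 0;

	for (int i = 0; i < NUM_COUNTERS; i++)
		if (!(own & (UINT32_C(1) << i)) && p->value[i] != 0)
			mask |= UINT32_C(1) << i;
	return mask;
}
```

With counters 0 and 1 assigned and only counter 0 owned by the reader, the
model still reports counter 1 as readable after the stale counters have
been cleared.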

[...]
+static void armv8pmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR) || (atomic_read(&event->mmap_count) != 1))
+		return;
+
+	if (atomic_inc_return(&event->ctx->nr_user) == 1) {
+		unsigned long flags;
+		atomic_inc(&event->pmu->sched_cb_usage);
+		local_irq_save(flags);
+		armv8pmu_enable_user_access(to_arm_pmu(event->pmu));
+		local_irq_restore(flags);
+	}
+}
+
+static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR) || (atomic_read(&event->mmap_count) != 1))
+		return;
+
+	if (atomic_dec_and_test(&event->ctx->nr_user)) {
+		atomic_dec(&event->pmu->sched_cb_usage);
+		armv8pmu_disable_user_access();
+	}
 }

We can open an event for task A, but call mmap()/munmap() for that event
from task B, which will do the enable/disable on task B rather than task
A. The perf core doesn't enforce that the mmap is performed on the same CPU,
so I don't think this is quite right, unfortunately.

I reckon we need to do something with task_function_call() to make this
happen in the context of the expected task.
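
As a rough illustration of the difference, here is a toy userspace model, not
kernel code and not a proposed fix. `run_in_task()` is a hypothetical
stand-in for the idea of running a function in the context of a given task;
it does not reproduce the signature or behaviour of task_function_call().
The `task` and `event` structs are likewise invented for the model.

```c
#include <stdbool.h>
#include <stddef.h>

struct task { int pid; bool el0_access; };	/* toy per-task state */
struct event { struct task *owner; };		/* task the event was opened for */

static struct task *current_task;	/* stands in for the kernel's "current" */

/* Like the quoted hunk: acts on whichever task happens to call mmap(). */
static void mapped_naive(struct event *e)
{
	(void)e;
	current_task->el0_access = true;
}

/* Hypothetical helper: run func as if in the context of task t. */
static void run_in_task(struct task *t, void (*func)(void *), void *info)
{
	struct task *saved = current_task;

	current_task = t;
	func(info);
	current_task = saved;
}

static void enable_on_current(void *info)
{
	(void)info;
	current_task->el0_access = true;
}

/* Route the enable to the event's owner instead of the caller. */
static void mapped_routed(struct event *e)
{
	run_in_task(e->owner, enable_on_current, NULL);
}
```

If task B maps an event owned by task A, mapped_naive() flips the state of B,
while mapped_routed() flips the state of A and leaves B untouched.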

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel