Re: [PATCH 12/19] perf: Ignore event state for group validation

[PATCH 00/19] perf: Rework event_init checks · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 01/19] perf/arm-cmn: Fix event validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 01/19] perf/arm-cmn: Fix event validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
[PATCH 02/19] perf/hisilicon: Fix group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-26
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 02/19] perf/hisilicon: Fix group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-27
[PATCH 03/19] perf/imx8_ddr: Fix group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 04/19] perf/starfive: Fix group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 05/19] iommu/vt-d: Fix perfmon group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 06/19] ARM: l2x0: Fix group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 07/19] ARM: imx: Fix MMDC PMU group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 08/19] perf/arm_smmu_v3: Improve group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 09/19] perf/qcom: Improve group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 10/19] perf/arm-ni: Improve event validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 11/19] perf/arm-cci: Tidy up event validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 12/19] perf: Ignore event state for group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 12/19] perf: Ignore event state for group validation · Peter Zijlstra <peterz@infradead.org> · 2025-08-26
Re: [PATCH 12/19] perf: Ignore event state for group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-26
Re: [PATCH 12/19] perf: Ignore event state for group validation · Ian Rogers <irogers@google.com> · 2025-08-26
Re: [PATCH 12/19] perf: Ignore event state for group validation · Mark Rutland <mark.rutland@arm.com> · 2025-08-27
Re: [PATCH 12/19] perf: Ignore event state for group validation · Ian Rogers <irogers@google.com> · 2025-08-27
[PATCH 13/19] perf: Add helper for checking grouped events · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 13/19] perf: Add helper for checking grouped events · kernel test robot <hidden> · 2025-08-14
[PATCH 14/19] perf: Clean up redundant group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 15/19] perf: Simplify group validation · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
[PATCH 16/19] perf: Introduce positive capability for sampling · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Peter Zijlstra <peterz@infradead.org> · 2025-08-26
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Robin Murphy <robin.murphy@arm.com> · 2025-08-26
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Leo Yan <leo.yan@arm.com> · 2025-08-26
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Robin Murphy <robin.murphy@arm.com> · 2025-08-26
Re: [PATCH 16/19] perf: Introduce positive capability for sampling · Leo Yan <leo.yan@arm.com> · 2025-08-27
[PATCH 17/19] perf: Retire PERF_PMU_CAP_NO_INTERRUPT · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 17/19] perf: Retire PERF_PMU_CAP_NO_INTERRUPT · Peter Zijlstra <peterz@infradead.org> · 2025-08-26
[PATCH 18/19] perf: Introduce positive capability for raw events · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Robin Murphy <robin.murphy@arm.com> · 2025-08-19
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Thomas Richter <hidden> · 2025-08-20
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Robin Murphy <robin.murphy@arm.com> · 2025-08-20
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · kernel test robot <hidden> · 2025-08-21
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Mark Rutland <mark.rutland@arm.com> · 2025-08-26
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Robin Murphy <robin.murphy@arm.com> · 2025-08-26
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Mark Rutland <mark.rutland@arm.com> · 2025-08-27
Re: [PATCH 18/19] perf: Introduce positive capability for raw events · Thomas Richter <hidden> · 2025-08-27
[PATCH 19/19] perf: Garbage-collect event_init checks · Robin Murphy <robin.murphy@arm.com> · 2025-08-13
Re: [PATCH 19/19] perf: Garbage-collect event_init checks · kernel test robot <hidden> · 2025-08-14
Re: [PATCH 19/19] perf: Garbage-collect event_init checks · kernel test robot <hidden> · 2025-08-19
Re: [PATCH 19/19] perf: Garbage-collect event_init checks · Robin Murphy <robin.murphy@arm.com> · 2025-08-19
Re: [PATCH 19/19] perf: Garbage-collect event_init checks · Robin Murphy <robin.murphy@arm.com> · 2025-08-19

From: Ian Rogers <irogers@google.com>
Date: 2025-08-26 18:49:01
Also in: amd-gfx, dmaengine, dri-devel, imx, intel-gfx, intel-xe, linux-alpha, linux-amlogic, linux-arm-kernel, linux-arm-msm, linux-cxl, linux-fpga, linux-iommu, linux-mips, linux-perf-users, linux-pm, linux-riscv, linux-rockchip, linux-s390, linux-sh, lkml, loongarch, sparclinux

On Tue, Aug 26, 2025 at 8:32 AM Robin Murphy [off-list ref] wrote:

On 2025-08-26 2:03 pm, Peter Zijlstra wrote:

quoted

On Wed, Aug 13, 2025 at 06:01:04PM +0100, Robin Murphy wrote:

quoted

It may have been different long ago, but today it seems wrong for these
drivers to skip counting disabled sibling events in group validation,
given that perf_event_enable() could make them schedulable again, and
thus increase the effective size of the group later. Conversely, if a
sibling event is truly dead then it stands to reason that the whole
group is dead, so it's not worth going to any special effort to try to
squeeze in a new event that's never going to run anyway. Thus, we can
simply remove all these checks.

So currently you can do sort of a manual event rotation inside an
over-sized group and have it work.

I'm not sure if anybody actually does this, but its possible.

Eg. on a PMU that supports only 4 counters, create a group of 5 and
periodically cycle which of the 5 events is off.

I'm not sure this is true, I thought this would fail in the
perf_event_open when adding the 5th event and there being insufficient
counters for the group. Not all PMUs validate a group will fit on the
counters, but I thought at least Intel's core PMU would validate and
not allow this. Fwiw, the metric code is reliant on this behavior as
by default all events are placed into a weak group:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/metricgroup.c?h=perf-tools-next#n631
Weak groups are really just groups that when the perf_event_open fails
retry with the grouping removed. PMUs that don't fail the
perf_event_open are problematic as the reads just report "not counted"
and the metric doesn't work. Sometimes the PMU can't help it due to
errata. There are a bunch of workarounds for those cases carried in
the perf tool, but in general weak groups working is relied upon:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/pmu-events.h?h=perf-tools-next#n16

Thanks,
Ian

quoted

So I'm not against changing this, but changing stuff like this always
makes me a little fearful -- it wouldn't be the first time that when it
finally trickles down to some 'enterprise' user in 5 years someone comes
and finally says, oh hey, you broke my shit :-(

Eww, I see what you mean... and I guess that's probably lower-overhead
than actually deleting and recreating the sibling event(s) each time,
and potentially less bother then wrangling multiple groups for different
combinations of subsets when one simply must still approximate a complex
metric that requires more counters than the hardware offers.

I'm also not keen to break anything that wasn't already somewhat broken,
especially since this patch is only intended as cleanup, so either we
could just drop it altogether, or perhaps I can wrap the existing
behaviour in a helper that can at least document this assumption and
discourage new drivers from copying it. Am I right that only
PERF_EVENT_STATE_{OFF,ERROR} would matter for this, though, and my
reasoning for state <= PERF_EVENT_STATE_EXIT should still stand? As for
the fiddly discrepancy with enable_on_exec between arm_pmu and others
I'm not really sure what to think...

Thanks,
Robin.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help