--- v4
+++ v3
@@ -1,19 +1,16 @@
Power 9 has an In-Memory Collection (IMC) infrastructure which contains
various Performance Monitoring Units (PMUs) at Nest level (these are
-on-chip but off-core), Core level and Thread level.
+on-chip but off-core). These Nest PMU counters are handled by a Nest
+IMC microcode. This microcode runs in the OCC (On-Chip Controller)
+complex and its purpose is to program the nest counters, collect the
+counter data and move the counter data to memory.
-The Nest PMU counters are handled by a Nest IMC microcode which runs
-in the OCC (On-Chip Controller) complex. The microcode collects the
-counter data and moves the nest IMC counter data to memory.
-
-The Core and Thread IMC PMU counters are handled in the core. Core
-level PMU counters give us the IMC counters' data per core and thread
-level PMU counters give us the IMC counters' data per CPU thread.
-
-This patchset enables the nest IMC, core IMC and thread IMC
-PMUs and is based on the initial work done by Madhavan Srinivasan.
-"Nest Instrumentation Support" :
-https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-August/132078.html
+The IMC infrastructure encapsulates nest (per-chip), core and thread
+level counters. While the nest IMC PMUs are handled by the nest IMC
+microcode, the core and thread level PMUs are handled by the Core-HPMC
+engine. This patchset enables the nest IMC PMUs and is based on the
+initial work done by Madhavan Srinivasan.
+"Nest Instrumentation Support" : https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-August/132078.html
v1 for this patchset can be found here :
https://lwn.net/Articles/705475/
@@ -22,17 +19,8 @@
Per-chip nest instrumentation provides various per-chip metrics
such as memory, powerbus, Xlink and Alink bandwidth.
-Core events:
-Per-core IMC instrumentation provides various per-core metrics
-such as non-idle cycles, non-idle instructions, various cache and
-memory related metrics etc.
-
-Thread events:
-All thread-level events are the same as the core-level events, the only
-difference being the domain. These are per-cpu metrics.
-
PMU Events' Information:
-OPAL obtains the IMC PMU and event information from the IMC Catalog
+OPAL obtains the Nest PMU and event information from the IMC Catalog
and passes it on to the kernel via the device tree. The events' information
contains :
- Event name
@@ -50,7 +38,7 @@
accumulated.
The OPAL-side patches are posted upstream :
-https://lists.ozlabs.org/pipermail/skiboot/2017-January/005979.html
+https://lists.ozlabs.org/pipermail/skiboot/2016-November/005552.html
The kernel discovers the IMC counters information in the device tree
at the "imc-counters" device node which has a compatible field
@@ -61,100 +49,41 @@
discover the "imc-counters" node and walk through the pmu and event
nodes.
-Here is an excerpt of the dt showing the imc-counters with
-mcs0 (nest), core and thread node:
+Here is an excerpt of the dt showing the imc-counters and mcs node:
/dts-v1/;
[...]
-
-/dts-v1/;
-
-/ {
- name = "";
- compatible = "ibm,opal-in-memory-counters";
- #address-cells = <0x1>;
- #size-cells = <0x1>;
- imc-nest-offset = <0x320000>;
- imc-nest-size = <0x30000>;
- version-id = "";
-
- NEST_MCS: nest-mcs-events {
+ imc-counters {
+ imc-nest-offset = <0x320000>;
+ compatible = "ibm,opal-in-memory-counters";
+ imc-nest-size = <0x30000>;
#address-cells = <0x1>;
#size-cells = <0x1>;
-
- event@0 {
- event-name = "RRTO_QFULL_NO_DISP" ;
- reg = <0x0 0x8>;
- desc = "RRTO not dispatched in MCS0 due to capacity - pulses once for each time a valid RRTO op is not dispatched due to a command list full condition" ;
- };
- event@8 {
- event-name = "WRTO_QFULL_NO_DISP" ;
- reg = <0x8 0x8>;
- desc = "WRTO not dispatched in MCS0 due to capacity - pulses once for each time a valid WRTO op is not dispatched due to a command list full condition" ;
- };
- [...]
- mcs0 {
- compatible = "ibm,imc-counters-nest";
- events-prefix = "PM_MCS0_";
- unit = "";
- scale = "";
- reg = <0x118 0x8>;
- events = < &NEST_MCS >;
- };
-
- mcs1 {
- compatible = "ibm,imc-counters-nest";
- events-prefix = "PM_MCS1_";
- unit = "";
- scale = "";
- reg = <0x198 0x8>;
- events = < &NEST_MCS >;
- };
- [...]
+ phandle = <0x10000238>;
+ version-id = [00];
- CORE_EVENTS: core-events {
- #address-cells = <0x1>;
- #size-cells = <0x1>;
-
- event@e0 {
- event-name = "0THRD_NON_IDLE_PCYC" ;
- reg = <0xe0 0x8>;
- desc = "The number of processor cycles when all threads are idle" ;
- };
- event@120 {
- event-name = "1THRD_NON_IDLE_PCYC" ;
- reg = <0x120 0x8>;
- desc = "The number of processor cycles when exactly one SMT thread is executing non-idle code" ;
- };
- [...]
- core {
- compatible = "ibm,imc-counters-core";
- events-prefix = "CPM_";
- unit = "";
- scale = "";
- reg = <0x0 0x8>;
- events = < &CORE_EVENTS >;
- };
-
- thread {
- compatible = "ibm,imc-counters-core";
- events-prefix = "CPM_";
- unit = "";
- scale = "";
- reg = <0x0 0x8>;
- events = < &CORE_EVENTS >;
- };
-};
+ mcs0 {
+ compatible = "ibm,imc-counters-chip";
+ ranges;
+ #address-cells = <0x1>;
+ #size-cells = <0x1>;
+ phandle = <0x10000279>;
+ scale = "1.2207e-4";
+ unit = "MiB";
+
+ event@528 {
+ event-name = "PM_MCS_UP_128B_DATA_XFER_MC0" ;
+ desc = "Total Read Bandwidth seen on both MCS of MC0";
+ phandle = <0x1000028c>;
+ reg = <0x118 0x8>;
+ };
+[...]
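As a rough illustration of how the "scale" and "unit" properties in the
excerpt above might be applied, here is a minimal Python sketch. It assumes
that the raw counter value is simply multiplied by "scale" to obtain a value
in "unit"; the cover letter does not spell this out. The helper name
scaled_value and the raw count are made up for the example. Note that
1.2207e-4 is approximately 128 / 2^20, which would be consistent with each
count representing one 128-byte transfer expressed in MiB.

```python
# Property values copied from the mcs0 node in the excerpt.
SCALE = float("1.2207e-4")  # "scale" property
UNIT = "MiB"                # "unit" property


def scaled_value(raw_count, scale=SCALE):
    """Convert a raw counter value to the PMU's unit.

    Assumption (not stated in the cover letter): value = raw_count * scale.
    """
    return raw_count * scale


raw = 8192  # made-up raw count, for illustration only
print(f"{scaled_value(raw):.4f} {UNIT}")
```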
From the device tree, the kernel parses the PMUs and their events'
information.
-After parsing the IMC PMUs and their events, the PMUs and their
+After parsing the nest IMC PMUs and their events, the PMUs and their
attributes are registered in the kernel.
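The parse step described above can be pictured with a small Python sketch.
This is not the kernel code from the patches; it only models the v3
device-tree excerpt as a nested dict and walks it the way the text describes
(PMU nodes first, then their event nodes). The dict layout and the function
name walk_imc_counters are assumptions made for illustration.

```python
# Hypothetical in-memory model of the v3 "imc-counters" excerpt.
IMC_COUNTERS = {
    "compatible": "ibm,opal-in-memory-counters",
    "imc-nest-offset": 0x320000,
    "imc-nest-size": 0x30000,
    "mcs0": {
        "compatible": "ibm,imc-counters-chip",
        "scale": "1.2207e-4",
        "unit": "MiB",
        "event@528": {
            "event-name": "PM_MCS_UP_128B_DATA_XFER_MC0",
            "desc": "Total Read Bandwidth seen on both MCS of MC0",
            "reg": (0x118, 0x8),  # <offset size> as given in the excerpt
        },
    },
}


def walk_imc_counters(node):
    """Yield (pmu_name, event_name, offset, size) for every event node."""
    for pmu_name, pmu in node.items():
        if not isinstance(pmu, dict):
            continue  # plain property of imc-counters, not a PMU node
        for child_name, child in pmu.items():
            if isinstance(child, dict) and child_name.startswith("event@"):
                offset, size = child["reg"]
                yield pmu_name, child["event-name"], offset, size


events = list(walk_imc_counters(IMC_COUNTERS))
for pmu, name, offset, size in events:
    print(f"{pmu}: {name} at offset {offset:#x}, {size} bytes")
```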
-
-This patchset (patches 9 and 10) configures the thread level IMC PMUs
-to count for tasks, which gives us the thread level metric values per
-task.
Example Usage :
# perf list
@@ -163,31 +92,16 @@
nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/ [Kernel PMU event]
nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0_LAST_SAMPLE/ [Kernel PMU event]
[...]
- core_imc/CPM_NON_IDLE_INST/ [Kernel PMU event]
- core_imc/CPM_NON_IDLE_PCYC/ [Kernel PMU event]
- [...]
- thread_imc/CPM_NON_IDLE_INST/ [Kernel PMU event]
- thread_imc/CPM_NON_IDLE_PCYC/ [Kernel PMU event]
-To see per chip data for nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/ :
# perf stat -e "nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/" -a --per-socket
-To see non-idle instructions for core 0 :
- # ./perf stat -e "core_imc/CPM_NON_IDLE_INST/" -C 0 -I 1000
-
-To see non-idle instructions for a "make" :
- # ./perf stat -e "thread_imc/CPM_NON_IDLE_PCYC/" make
+TODOs:
+ - Add support for Core IMC.
+ - Add support for thread IMC.
Comments/feedback/suggestions are welcome.
Changelog:
- v3 -> v4 :
- - Changed the events parser code to discover the PMU and events because
- of the changed format of the IMC DTS file (Patch 3).
- - Implemented the two TODOs to include core and thread IMC support with
- this patchset (Patches 7 through 10).
- - Changed the CPU hotplug code of Nest IMC PMUs to include a new state
- CPUHP_AP_PERF_POWERPC_NEST_ONLINE (Patch 6).
v2 -> v3 :
- Changed all references for IMA (In-Memory Accumulation) to IMC (In-Memory
Collection).
@@ -209,33 +123,26 @@
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
-Cc: Balbir Singh <bsingharora@gmail.com>
-Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
-Hemant Kumar (10):
+Hemant Kumar (6):
powerpc/powernv: Data structure and macros definitions
powerpc/powernv: Autoload IMC device driver module
powerpc/powernv: Detect supported IMC units and its events
powerpc/perf: Add event attribute and group to IMC pmus
powerpc/perf: Generic imc pmu event functions
powerpc/perf: IMC pmu cpumask and cpu hotplug support
- powerpc/powernv: Core IMC events detection
- powerpc/perf: PMU functions for Core IMC and hotplugging
- powerpc/powernv: Thread IMC events detection
- powerpc/perf: Thread IMC PMU functions
- arch/powerpc/include/asm/imc-pmu.h | 83 +++
- arch/powerpc/include/asm/opal-api.h | 11 +-
- arch/powerpc/include/asm/opal.h | 5 +
+ arch/powerpc/include/asm/imc-pmu.h | 74 ++++
+ arch/powerpc/include/asm/opal-api.h | 3 +-
+ arch/powerpc/include/asm/opal.h | 2 +
arch/powerpc/perf/Makefile | 6 +-
- arch/powerpc/perf/imc-pmu.c | 775 +++++++++++++++++++++++++
+ arch/powerpc/perf/imc-pmu.c | 383 ++++++++++++++++++++
arch/powerpc/platforms/powernv/Makefile | 2 +-
- arch/powerpc/platforms/powernv/opal-imc.c | 553 ++++++++++++++++++
- arch/powerpc/platforms/powernv/opal-wrappers.S | 2 +
+ arch/powerpc/platforms/powernv/opal-imc.c | 478 +++++++++++++++++++++++++
+ arch/powerpc/platforms/powernv/opal-wrappers.S | 1 +
arch/powerpc/platforms/powernv/opal.c | 13 +
- include/linux/cpuhotplug.h | 2 +
- 10 files changed, 1449 insertions(+), 3 deletions(-)
+ 9 files changed, 959 insertions(+), 3 deletions(-)
create mode 100644 arch/powerpc/include/asm/imc-pmu.h
create mode 100644 arch/powerpc/perf/imc-pmu.c
create mode 100644 arch/powerpc/platforms/powernv/opal-imc.c