Thread: 44 messages, 8 authors, 2018-06-07

Re: [PATCH 6/7] psi: pressure stall information for CPU, memory, and IO

From: Randy Dunlap <hidden>
Date: 2018-05-08 00:43:22
Also in: linux-mm, lkml

On 05/07/2018 02:01 PM, Johannes Weiner wrote:
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 Documentation/accounting/psi.txt |  73 ++++++
 include/linux/psi.h              |  27 ++
 include/linux/psi_types.h        |  84 ++++++
 include/linux/sched.h            |  10 +
 include/linux/sched/stat.h       |  10 +-
 init/Kconfig                     |  16 ++
 kernel/fork.c                    |   4 +
 kernel/sched/Makefile            |   1 +
 kernel/sched/core.c              |   3 +
 kernel/sched/psi.c               | 424 +++++++++++++++++++++++++++++++
 kernel/sched/sched.h             | 166 ++++++------
 kernel/sched/stats.h             |  91 ++++++-
 mm/compaction.c                  |   5 +
 mm/filemap.c                     |  15 +-
 mm/page_alloc.c                  |  10 +
 mm/vmscan.c                      |  13 +
 16 files changed, 859 insertions(+), 93 deletions(-)
 create mode 100644 Documentation/accounting/psi.txt
 create mode 100644 include/linux/psi.h
 create mode 100644 include/linux/psi_types.h
 create mode 100644 kernel/sched/psi.c
diff --git a/Documentation/accounting/psi.txt b/Documentation/accounting/psi.txt
new file mode 100644
index 000000000000..e051810d5127
--- /dev/null
+++ b/Documentation/accounting/psi.txt
@@ -0,0 +1,73 @@
Looks good to me.

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
new file mode 100644
index 000000000000..052c529a053b
--- /dev/null
+++ b/kernel/sched/psi.c
@@ -0,0 +1,424 @@
+/*
+ * Measure workload productivity impact from overcommitting CPU, memory, IO
+ *
+ * Copyright (c) 2017 Facebook, Inc.
+ * Author: Johannes Weiner <hannes@cmpxchg.org>
+ *
+ * Implementation
+ *
+ * Task states -- running, iowait, memstall -- are tracked through the
+ * scheduler and aggregated into a system-wide productivity state. The
+ * ratio between the times spent in productive states and delays tells
+ * us the overall productivity of the workload.
+ *
+ * The ratio is tracked in decaying time averages over 10s, 1m, 5m
+ * windows. Cumluative stall times are tracked and exported as well to
               Cumulative
+ * allow detection of latency spikes and custom time averaging.
+ *
+ * Multiple CPUs
+ *
+ * To avoid cache contention, times are tracked local to the CPUs. To
+ * get a comprehensive view of a system or cgroup, we have to consider
+ * the fact that CPUs could be unevenly loaded or even entirely idle
+ * if the workload doesn't have enough threads. To avoid artifacts
+ * caused by that, when adding up the global pressure ratio, the
+ * CPU-local ratios are weighed according to their non-idle time:
+ *
+ *   Time the CPU had stalled tasks    Time the CPU was non-idle
+ *   ------------------------------ * ---------------------------
+ *                Walltime            Time all CPUs were non-idle
+ */
+
+/**
+ * psi_memstall_leave - mark the end of an memory stall section
                                    end of a memory
+ * @flags: flags to handle nested memdelay sections
+ *
+ * Marks the calling task as no longer stalled due to lack of memory.
+ */
+void psi_memstall_leave(unsigned long *flags)
+{
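
For readers following the "Multiple CPUs" comment in the hunk above, here is a minimal userspace sketch of the weighting it describes: each CPU's stall ratio is scaled by that CPU's share of the total non-idle time, and the results are summed. The names `struct cpu_sample` and `weighted_pressure` are hypothetical and do not come from the patch, and floating point is used only for readability; this is an illustration of the formula, not the code in kernel/sched/psi.c.

```c
#include <stddef.h>

/* Hypothetical per-CPU sample; not a structure from the patch. */
struct cpu_sample {
	double stalled;	/* time the CPU had stalled tasks */
	double nonidle;	/* time the CPU was non-idle */
};

/*
 * Sum over all CPUs of
 *   (stalled / walltime) * (nonidle / total non-idle time),
 * as in the comment's formula. Returns 0 if no CPU did any work
 * or if walltime is not positive, to avoid dividing by zero.
 */
static double weighted_pressure(const struct cpu_sample *s, size_t n,
				double walltime)
{
	double total_nonidle = 0.0;
	double ratio = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		total_nonidle += s[i].nonidle;

	if (total_nonidle <= 0.0 || walltime <= 0.0)
		return 0.0;

	for (i = 0; i < n; i++)
		ratio += (s[i].stalled / walltime) *
			 (s[i].nonidle / total_nonidle);

	return ratio;
}
```

With one CPU stalled for half of a 10-unit window and a second CPU entirely idle, the weighted result is the busy CPU's own ratio, whereas a plain average over both CPUs would halve it. That is the artifact the comment says the weighting is meant to avoid.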


-- 
~Randy