Re: [PATCH v23 03/15] mm/damon: Adaptively adjust regions

From: SeongJae Park <hidden>
Date: 2021-02-02 09:45:20
Also in: linux-doc, lkml

On Mon, 1 Feb 2021 09:37:33 -0800 Shakeel Butt [off-list ref] wrote:

On Tue, Dec 15, 2020 at 3:57 AM SeongJae Park [off-list ref] wrote:

quoted

From: SeongJae Park <redacted>

Even somehow the initial monitoring target regions are well constructed
to fulfill the assumption (pages in same region have similar access
frequencies), the data access pattern can be dynamically changed.  This
will result in low monitoring quality.  To keep the assumption as much
as possible, DAMON adaptively merges and splits each region based on
their access frequency.

For each ``aggregation interval``, it compares the access frequencies of
adjacent regions and merges those if the frequency difference is small.
Then, after it reports and clears the aggregated access frequency of
each region, it splits each region into two or three regions if the
total number of regions will not exceed the user-specified maximum
number of regions after the split.

Should there be any concerns regarding the number of regions
oscillating even when the access pattern of the application is not
changing? Does the system converge to equilibrium state or does it not
matter?

DAMON will continue splitting regions, but all the changes will be reverted by
merging.  Because callbacks are called after merging finished, this would not
matter.

quoted

In this way, DAMON provides its best-effort quality and minimal overhead
while keeping the upper-bound overhead that users set.

Signed-off-by: SeongJae Park <redacted>
Reviewed-by: Leonard Foerster <redacted>
---
 include/linux/damon.h |  41 +++++---
 mm/damon/core.c       | 220 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 240 insertions(+), 21 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 7d4685adc8a9..f446f8433599 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h

@@ -12,6 +12,9 @@
 #include <linux/time64.h>
 #include <linux/types.h>

+/* Minimal region size.  Every damon_region is aligned by this. */
+#define DAMON_MIN_REGION       PAGE_SIZE
+
 /**
  * struct damon_addr_range - Represents an address region of [@start, @end).
  * @start:     Start address of the region (inclusive).

@@ -86,6 +89,8 @@ struct damon_ctx;
  * prepared for the next access check.
  * @check_accesses should check the accesses to each region that made after the
  * last preparation and update the number of observed accesses of each region.
+ * It should also return max number of observed accesses that made as a result
+ * of its update.

Why?

To get the max access count without additional iteration of regions.  The count
will be used to calculate merge/split threshold.  I will add this explanation
in the next version.

The additional iteration would not be a real performance bottleneck for usual
case, so I we could make this optimization later.  However, because making such
optimization with callback interface would cause some backward compatibility
issue, I'd like to do this now.

quoted

  * @reset_aggregated should reset the access monitoring results that aggregated
  * by @check_accesses.
  * @target_valid should check whether the target is still valid for the

[...]

quoted

+unsigned int damon_nr_regions(struct damon_target *t)
+{
+       struct damon_region *r;
+       unsigned int nr_regions = 0;
+
+       damon_for_each_region(r, t)
+               nr_regions++;

Why not just add the region_count filed in damon_target?

Just to make the code simpler.  We can easily optimize in the way if this turns
out to be a real performance bottleneck.

quoted

+
+       return nr_regions;
+}
+
 struct damon_ctx *damon_new_ctx(enum damon_target_type type)
 {
        struct damon_ctx *ctx;

@@ -128,8 +143,12 @@ struct damon_ctx *damon_new_ctx(enum damon_target_type type)
        mutex_init(&ctx->kdamond_lock);

        ctx->target_type = type;
-       if (type != DAMON_ARBITRARY_TARGET)
-               INIT_LIST_HEAD(&ctx->region_targets);
+       if (type != DAMON_ARBITRARY_TARGET) {
+               ctx->min_nr_regions = 10;
+               ctx->max_nr_regions = 1000;

IMO these settings/heuristics should be part of the virtual address
space monitor primitives and not be in the core monitor.

These are just default values.  For the adpative regions adjustment, I think we
agreed on adding it in the core for now.

quoted

+
+               INIT_LIST_HEAD(&ctx->adaptive_targets);
+       }

        return ctx;
 }

[...]

quoted

+
+/*
+ * Split every target region into randomly-sized small regions
+ *
+ * This function splits every target region into random-sized small regions if
+ * current total number of the regions is equal or smaller than half of the
+ * user-specified maximum number of regions.  This is for maximizing the
+ * monitoring accuracy under the dynamically changeable access patterns.  If a
+ * split was unnecessarily made, later 'kdamond_merge_regions()' will revert
+ * it.
+ */
+static void kdamond_split_regions(struct damon_ctx *ctx)
+{
+       struct damon_target *t;
+       unsigned int nr_regions = 0;
+       static unsigned int last_nr_regions;
+       int nr_subregions = 2;
+
+       damon_for_each_target(t, ctx)
+               nr_regions += damon_nr_regions(t);
+
+       if (nr_regions > ctx->max_nr_regions / 2)
+               return;

Shouldn't the limits on region be per-target instead of for the whole context?

I think this makes the monitoring overhead upperbound setting simpler.  If we
need to set per-target monitoring upperbound, we can use multiple contexts for
each target.

quoted

+
+       /* Maybe the middle of the region has different access frequency */
+       if (last_nr_regions == nr_regions &&
+                       nr_regions < ctx->max_nr_regions / 3)
+               nr_subregions = 3;
+
+       damon_for_each_target(t, ctx)
+               damon_split_regions_of(ctx, t, nr_subregions);
+
+       last_nr_regions = nr_regions;
+}
+
 /*
  * Check whether it is time to check and apply the target monitoring regions
  *

@@ -391,6 +588,8 @@ static int kdamond_fn(void *data)
        struct damon_ctx *ctx = (struct damon_ctx *)data;
        struct damon_target *t;
        struct damon_region *r, *next;
+       unsigned int max_nr_accesses = 0;
+       unsigned long sz_limit = 0;

        pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
@@ -399,6 +598,8 @@ static int kdamond_fn(void *data)
        if (ctx->callback.before_start && ctx->callback.before_start(ctx))
                set_kdamond_stop(ctx);

+       sz_limit = damon_region_sz_limit(ctx);
+
        while (!kdamond_need_stop(ctx)) {
                if (ctx->primitive.prepare_access_checks)
                        ctx->primitive.prepare_access_checks(ctx);
@@ -409,14 +610,20 @@ static int kdamond_fn(void *data)
                usleep_range(ctx->sample_interval, ctx->sample_interval + 1);

                if (ctx->primitive.check_accesses)
-                       ctx->primitive.check_accesses(ctx);
+                       max_nr_accesses = ctx->primitive.check_accesses(ctx);

                if (kdamond_aggregate_interval_passed(ctx)) {
+                       if (ctx->target_type != DAMON_ARBITRARY_TARGET)
+                               kdamond_merge_regions(ctx,
+                                               max_nr_accesses / 10,

What's the reason behind this 10?

It came from my gut feeling and it is still there because it worked well with
my test workloads.  I think we could change that or allow users adjustable if
problematic case is found later.


Thanks,
SeongJae Park
[...]

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help