Re: [PATCH v7 01/15] sched/core: uclamp: Add CPU's clamp buckets refcounting
From: Suren Baghdasaryan <surenb@google.com>
Date: 2019-03-13 21:01:52
Also in: linux-pm, lkml
On Wed, Mar 13, 2019 at 8:15 AM Patrick Bellasi [off-list ref] wrote:
> On 12-Mar 13:52, Dietmar Eggemann wrote:
> > On 2/8/19 11:05 AM, Patrick Bellasi wrote:
> >
> > [...]
> >
> > > +config UCLAMP_BUCKETS_COUNT
> > > +	int "Number of supported utilization clamp buckets"
> > > +	range 5 20
> > > +	default 5
> > > +	depends on UCLAMP_TASK
> > > +	help
> > > +	  Defines the number of clamp buckets to use. The range of each bucket
> > > +	  will be SCHED_CAPACITY_SCALE/UCLAMP_BUCKETS_COUNT. The higher the
> > > +	  number of clamp buckets the finer their granularity and the higher
> > > +	  the precision of clamping aggregation and tracking at run-time.
> > > +
> > > +	  For example, with the default configuration we will have 5 clamp
> > > +	  buckets tracking 20% utilization each. A 25% boosted tasks will be
> > > +	  refcounted in the [20..39]% bucket and will set the bucket clamp
> > > +	  effective value to 25%.
> > > +	  If a second 30% boosted task should be co-scheduled on the same CPU,
> > > +	  that task will be refcounted in the same bucket of the first task and
> > > +	  it will boost the bucket clamp effective value to 30%.
> > > +	  The clamp effective value of a bucket is reset to its nominal value
> > > +	  (20% in the example above) when there are anymore tasks refcounted in
> >
> > this sounds weird.
>
> Why ?

Should probably be "when there are no more tasks refcounted"
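
To make the help text's example concrete, here is a minimal sketch of the refcounting it describes, working directly in percent. It is not the patch's code: all names (pct_bucket, pct_bucket_inc(), ...) are made up for illustration, and capping the id at the last bucket is my assumption for the 100% case.

```c
#include <assert.h>

#define NR_BUCKETS   5u                   /* the Kconfig default quoted above */
#define BUCKET_DELTA (100u / NR_BUCKETS)  /* 20% per bucket, in percent */

/* Hypothetical per-CPU bucket: a refcount plus the effective clamp value. */
struct pct_bucket {
	unsigned int tasks;  /* tasks currently refcounted in this bucket */
	unsigned int value;  /* effective clamp value, in percent */
};

/* Map a clamp value to its bucket; 100% is capped to the last bucket (assumption). */
static unsigned int pct_bucket_id(unsigned int pct)
{
	unsigned int id = pct / BUCKET_DELTA;

	return id < NR_BUCKETS ? id : NR_BUCKETS - 1;
}

/* Every bucket starts empty, at its nominal (lower bound) value. */
static void pct_buckets_init(struct pct_bucket *b)
{
	for (unsigned int i = 0; i < NR_BUCKETS; i++) {
		b[i].tasks = 0;
		b[i].value = i * BUCKET_DELTA;
	}
}

/* Enqueue: refcount the task and raise the bucket's effective value if needed. */
static void pct_bucket_inc(struct pct_bucket *b, unsigned int pct)
{
	struct pct_bucket *bkt = &b[pct_bucket_id(pct)];

	bkt->tasks++;
	if (pct > bkt->value)
		bkt->value = pct;
}

/* Dequeue: the effective value is reset only when no more tasks are refcounted. */
static void pct_bucket_dec(struct pct_bucket *b, unsigned int pct)
{
	unsigned int id = pct_bucket_id(pct);
	struct pct_bucket *bkt = &b[id];

	assert(bkt->tasks > 0);
	if (--bkt->tasks == 0)
		bkt->value = id * BUCKET_DELTA;
}
```

With a 25% task and then a 30% task enqueued, bucket 1 goes to 25% and then 30%; it drops back to its nominal 20% only once both have been dequeued.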
> > [...]
> >
> > > +static inline unsigned int uclamp_bucket_value(unsigned int clamp_value)
> > > +{
> > > +	return UCLAMP_BUCKET_DELTA * uclamp_bucket_id(clamp_value);
> > > +}
> >
> > Something like uclamp_bucket_nominal_value() should be clearer.
>
> Maybe... can update it in v8

uclamp_bucket_base_value is a little shorter, just to consider :)
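
For reference, a small standalone sketch of what the helper under discussion computes, whatever it ends up being called. The delta below is plain integer division as the Kconfig help text describes it, and the cap on the last bucket is my assumption; the patch may define UCLAMP_BUCKET_DELTA and uclamp_bucket_id() differently. The sketch_ names are hypothetical.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024u  /* assumed value of the capacity scale */
#define UCLAMP_BUCKETS       5u     /* the Kconfig default */
/* Assumed delta: SCHED_CAPACITY_SCALE / UCLAMP_BUCKETS, i.e. 204 here. */
#define UCLAMP_BUCKET_DELTA  (SCHED_CAPACITY_SCALE / UCLAMP_BUCKETS)

/* Bucket index for a clamp value, capped so that 1024 lands in the last bucket. */
static unsigned int sketch_bucket_id(unsigned int clamp_value)
{
	unsigned int id;

	assert(clamp_value <= SCHED_CAPACITY_SCALE);
	id = clamp_value / UCLAMP_BUCKET_DELTA;

	return id < UCLAMP_BUCKETS ? id : UCLAMP_BUCKETS - 1;
}

/* Base (nominal) value: the lower bound of the bucket the clamp value falls in. */
static unsigned int sketch_bucket_base_value(unsigned int clamp_value)
{
	return UCLAMP_BUCKET_DELTA * sketch_bucket_id(clamp_value);
}
```

Under these assumptions a clamp value of 256 falls in bucket 1, whose base value is 204, which is why "base" or "nominal" describes the result better than plain "value".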
> > > +static inline void uclamp_rq_update(struct rq *rq, unsigned int clamp_id)
> > > +{
> > > +	struct uclamp_bucket *bucket = rq->uclamp[clamp_id].bucket;
> > > +	unsigned int max_value = uclamp_none(clamp_id);
> > > +	unsigned int bucket_id;
> >
> > unsigned int bucket_id = UCLAMP_BUCKETS;
> >
> > > +
> > > +	/*
> > > +	 * Both min and max clamps are MAX aggregated, thus the topmost
> > > +	 * bucket with some tasks defines the rq's clamp value.
> > > +	 */
> > > +	bucket_id = UCLAMP_BUCKETS;
> >
> > to get rid of this line?
>
> I put it on a different line as a justification for the loop variable
> initialization described in the comment above.
> > > +	do {
> > > +		--bucket_id;
> > > +		if (!rq->uclamp[clamp_id].bucket[bucket_id].tasks)
> >
> > if (!bucket[bucket_id].tasks)
>
> Right... that's some leftover from the last refactoring!
>
> > [...]
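
A standalone sketch of the MAX aggregation in uclamp_rq_update(), using the local bucket pointer and initializing bucket_id at its declaration as suggested. The struct layout, the value field and the function name are assumptions for illustration; none_value stands in for whatever uclamp_none(clamp_id) returns.

```c
#include <assert.h>

#define UCLAMP_BUCKETS 5u  /* the Kconfig default */

/* Hypothetical bucket: refcount plus effective clamp value. */
struct sketch_bucket {
	unsigned int tasks;
	unsigned int value;
};

/*
 * Scan from the topmost bucket down. The first bucket with some tasks
 * defines the rq's clamp value; if all are empty, fall back to none_value.
 */
static unsigned int sketch_rq_clamp(const struct sketch_bucket *bucket,
				    unsigned int none_value)
{
	unsigned int bucket_id = UCLAMP_BUCKETS;

	assert(bucket);

	do {
		--bucket_id;
		if (!bucket[bucket_id].tasks)
			continue;  /* jumps to the while () check below */
		return bucket[bucket_id].value;
	} while (bucket_id);

	return none_value;
}
```

Since both min and max clamps are MAX aggregated, the same loop serves either clamp_id; only the fallback value differs.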
> > > + * within each bucket the exact "requested" clamp value whenever all tasks
> > > + * RUNNABLE in that bucket require the same clamp.
> > > + */
> > > +static inline void uclamp_rq_inc_id(struct task_struct *p, struct rq *rq,
> > > +				    unsigned int clamp_id)
> > > +{
> > > +	unsigned int bucket_id = p->uclamp[clamp_id].bucket_id;
> > > +	unsigned int rq_clamp, bkt_clamp, tsk_clamp;
> >
> > Wouldn't it be easier to have a pointer to the task's and rq's uclamp
> > structure as well as to the bucket?
> >
> > -	unsigned int bucket_id = p->uclamp[clamp_id].bucket_id;
> > +	struct uclamp_se *uc_se = &p->uclamp[clamp_id];
> > +	struct uclamp_rq *uc_rq = &rq->uclamp[clamp_id];
> > +	struct uclamp_bucket *bucket = &uc_rq->bucket[uc_se->bucket_id];
>
> I think I went back/forth a couple of times in using pointer or the
> extended version, which both have pros and cons.
>
> I personally prefer the pointers as you suggest but I've got the
> impression in the past that since everybody cleared "basic C trainings"
> it's not so difficult to read the code above too.
>
> > The code in uclamp_rq_inc_id() and uclamp_rq_dec_id() for example becomes
> > much more readable.
>
> Agree... let's try to switch once again in v8 and see ;)
> > [...]
> >
> > >  struct sched_class {
> > >  	const struct sched_class *next;
> > >
> > > +#ifdef CONFIG_UCLAMP_TASK
> > > +	int uclamp_enabled;
> > > +#endif
> > > +
> > >  	void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
> > >  	void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
> > > -	void (*yield_task) (struct rq *rq);
> > > -	bool (*yield_to_task)(struct rq *rq, struct task_struct *p, bool preempt);
> > >
> > >  	void (*check_preempt_curr)(struct rq *rq, struct task_struct *p, int flags);
> > > @@ -1685,7 +1734,6 @@ struct sched_class {
> > >  	void (*set_curr_task)(struct rq *rq);
> > >  	void (*task_tick)(struct rq *rq, struct task_struct *p, int queued);
> > >  	void (*task_fork)(struct task_struct *p);
> > > -	void (*task_dead)(struct task_struct *p);
> > >
> > >  	/*
> > >  	 * The switched_from() call is allowed to drop rq->lock, therefore we
> > > @@ -1702,12 +1750,17 @@ struct sched_class {
> > >
> > >  	void (*update_curr)(struct rq *rq);
> > >
> > > +	void (*yield_task) (struct rq *rq);
> > > +	bool (*yield_to_task)(struct rq *rq, struct task_struct *p, bool preempt);
> > > +
> > >  #define TASK_SET_GROUP 0
> > >  #define TASK_MOVE_GROUP 1
> > >
> > >  #ifdef CONFIG_FAIR_GROUP_SCHED
> > >  	void (*task_change_group)(struct task_struct *p, int type);
> > >  #endif
> > > +
> > > +	void (*task_dead)(struct task_struct *p);
> >
> > Why do you move yield_task, yield_to_task and task_dead here?
>
> Since I'm adding a new field at the beginning of the struct, which is
> used at enqueue/dequeue time, this is to ensure that all the callbacks
> used in these paths are grouped together and don't fall across a cache
> line... but yes, that's supposed to be a micro-optimization which I can
> skip in this patch.
>
> -- 
> #include <best/regards.h>
>
> Patrick Bellasi
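
If the reordering is kept, the cache line argument could be checked at build time rather than by inspection. A rough standalone sketch, assuming a 64-byte cache line and an instance aligned to it; sketch_class and its helpers are stand-ins for illustration, not the real struct sched_class.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64u  /* assumed cache line size */

struct sketch_rq;
struct sketch_task;

/* Reduced stand-in: hot enqueue/dequeue members first, cold callbacks last. */
struct sketch_class {
	const struct sketch_class *next;
	int uclamp_enabled;  /* the new field, read at enqueue/dequeue time */
	void (*enqueue_task)(struct sketch_rq *rq, struct sketch_task *p, int flags);
	void (*dequeue_task)(struct sketch_rq *rq, struct sketch_task *p, int flags);
	/* callbacks not used in the enqueue/dequeue paths */
	void (*yield_task)(struct sketch_rq *rq);
	void (*task_dead)(struct sketch_task *p);
};

/* Offset of the first byte past the last hot member. */
#define SKETCH_HOT_END \
	(offsetof(struct sketch_class, dequeue_task) + \
	 sizeof(((struct sketch_class *)0)->dequeue_task))

/* The build fails if the hot members spill past the first cache line. */
static_assert(SKETCH_HOT_END <= SKETCH_CACHE_LINE,
	      "enqueue/dequeue members must share the first cache line");

/* Runtime view of the same quantity, handy for printing or testing. */
static size_t sketch_hot_path_bytes(void)
{
	return SKETCH_HOT_END;
}
```

This only says something about real hardware if the structure itself starts on a cache line boundary, which the sketch does not enforce.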