Thread (89 messages) 89 messages, 6 authors, 2019-01-25

Re: [PATCH v6 04/16] sched/core: uclamp: Add CPU's clamp buckets refcounting

From: Patrick Bellasi <hidden>
Date: 2019-01-22 10:31:15
Also in: linux-pm, lkml

On 22-Jan 10:45, Peter Zijlstra wrote:
On Mon, Jan 21, 2019 at 04:33:38PM +0000, Patrick Bellasi wrote:
quoted
On 21-Jan 17:12, Peter Zijlstra wrote:
quoted
On Mon, Jan 21, 2019 at 03:23:11PM +0000, Patrick Bellasi wrote:
quoted
quoted
quoted
and keep all
the buckets in use at the beginning of a cache line.
That; is that the rationale for all this? Note that per the defaults
everything is in a single line already.
Yes, that's because of the loop in:

   dequeue_task()
     uclamp_cpu_dec()
       uclamp_cpu_dec_id()
         uclamp_cpu_update()

where buckets needs sometimes to be scanned to find a new max.

Consider also that, with mapping, we can more easily increase the
buckets count to 20 in order to have a finer clamping granularity if
needed without warring too much about performance impact especially
when we use anyway few different clamp values.

So, I agree that mapping adds (code) complexity but it can also save
few cycles in the fast path... do you think it's not worth the added
complexity?
Then maybe split this out in a separate patch? Do the trivial linear
bucket thing first and then do this smarty pants thing on top.

One problem with the scheme is that it doesn't defrag; so if you get a
peak usage, you can still end up with only two active buckets in
different lines.
You right, that was saved for a later optimization. :/

Mainly in consideration of the fact that, at least for the main usage
we have in mind on Android, we will likely configure all the required
clamps once for all at boot time.
Also; if it is it's own patch, you get a much better view of the
additional complexity and a chance to justify it ;-)
What about ditching the mapping for the time being and see if we
get a real overhead hit in the future ?

At that point we will revamp the mapping patch with also a proper
defrag support.
Also; would it make sense to do s/cpu/rq/ on much of this? All this
uclamp_cpu_*() stuff really is per rq and takes rq arguments, so why
does it have cpu in the name... no strong feelings, just noticed it and
thought is a tad inconsistent.
The idea behind using "cpu" instead of "rq" was that we use those only at
root rq level and the clamps are aggregated per-CPU.

I remember one of the first versions used "cpu" instead of "rq" as a
parameter name and you proposed to change it as an optimization since
we call it from dequeue_task() where we already have a *rq.

... but, since we have those uclamp data within struct rq, I think you
are right: it makes more sense to rename the functions.

Will do it in v7, thanks.

-- 
#include <best/regards.h>

Patrick Bellasi
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help