Re: [PATCH v3 1/3] sched/fair: add util_est on top of PELT
From: Patrick Bellasi <hidden>
Date: 2018-02-05 17:49:24
Also in:
lkml
On 30-Jan 15:01, Peter Zijlstra wrote:
> On Tue, Jan 30, 2018 at 02:04:32PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 30, 2018 at 12:46:33PM +0000, Patrick Bellasi wrote:
> > > > Aside from that being whitespace challenged, did you also try:
> > > >
> > > >   if ((unsigned)((util_est - util_last) + LIM - 1) < (2 * LIM - 1))
> > >
> > > No, since the above code IMO is so much "easy to parse for humans" :)
> >
> > Heh, true. Although that's fixable by wrapping it in some helper with a
> > comment.
> >
> > > But, mainly because since the cache alignment update, also while
> > > testing on a "big" Intel machine I cannot see regressions on
> > > hackbench.
> > >
> > > This is the code I get on my Xeon E5-2690 v2:
> > >
> > >        if (abs(util_est - util_last) <= (SCHED_CAPACITY_SCALE / 100))
> > >    6ba0:       8b 86 7c 02 00 00       mov    0x27c(%rsi),%eax
> > >    6ba6:       48 29 c8                sub    %rcx,%rax
> > >    6ba9:       48 99                   cqto
> > >    6bab:       48 31 d0                xor    %rdx,%rax
> > >    6bae:       48 29 d0                sub    %rdx,%rax
> > >    6bb1:       48 83 f8 0a             cmp    $0xa,%rax
> > >    6bb5:       7e 1d                   jle    6bd4 <dequeue_task_fair+0x7e4>
> > >
> > > Does it look so bad?
> >
> > Its not terrible, and I think your GCC is far more clever than the one I
>
> To clarify; my GCC at the time generated conditional branches to compute
> the absolute value; and in that case the thing I proposed wins hands
> down because its unconditional.
>
> However the above is also unconditional and then the difference is much
> less important.

I've finally convinced myself that we can live with the "parsing complexity" of your proposal... and wrapped into an inline it turned out to be not so bad.
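
For readers following along, here is a minimal sketch of what such an inline could look like. The helper names and signatures are assumptions for illustration, not taken from this thread. The sketch relies on the identity abs(x) < y <=> (unsigned)(x + y - 1) < (2 * y - 1), which holds for y >= 1 as long as x + y - 1 does not overflow an int:

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (name and signature are assumptions): true when
 * abs(value) < margin, using one add and one unsigned compare.
 *
 * Values below -(margin - 1) wrap to a huge unsigned number, so a single
 * comparison rejects both "too small" and "too big".
 *
 * Only valid for margin >= 1 and while value + margin - 1 fits in an int.
 */
static inline bool within_margin(int value, int margin)
{
	return (unsigned int)(value + margin - 1) < (unsigned int)(2 * margin - 1);
}

/* Reference version based on abs(), kept only for comparison. */
static inline bool within_margin_abs(int value, int margin)
{
	return abs(value) < margin;
}

/* Returns true if both versions agree for every value in [-range, range]. */
static bool margin_checks_agree(int margin, int range)
{
	for (int v = -range; v <= range; v++)
		if (within_margin(v, margin) != within_margin_abs(v, margin))
			return false;
	return true;
}
```

Note that the comparison is strict: a check written as abs(d) <= N corresponds to within_margin(d, N + 1) in this sketch.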
> > used at the time. But that's 4 dependent instructions (cqto,xor,sub,cmp)
> > whereas the one I proposed uses only 2 (add,cmp).

The ARM64 generated code is also simpler.
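
As an aside, the cqto/xor/sub sequence in the x86 listing is the usual branchless absolute value: cqto fills %rdx with the sign of %rax (all zeros or all ones), and the xor/sub pair then negates the value only when it was negative. A rough C equivalent, with a made-up function name and offered purely as an illustration of that dependency chain, might be:

```c
#include <stdint.h>

/*
 * Illustrative only: mirrors the cqto/xor/sub sequence from the listing.
 * Each step depends on the previous one, which is where the longer
 * dependency chain comes from.
 *
 * Undefined for x == INT64_MIN, where the result does not fit.
 */
static inline int64_t branchless_abs(int64_t x)
{
	int64_t mask = -(int64_t)(x < 0);	/* 0, or -1 (all ones) if x < 0 */

	return (x ^ mask) - mask;		/* x if mask == 0, else ~x + 1 */
}
```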
> > Now, my proposal is, as you say, somewhat hard to read, and it also
> > doesn't work right when our values are 'big' (which they will not be in
> > our case, because util has a very definite bound), and I suspect you're
> > right that ~2 cycles here will not be measurable.

Indeed, I cannot see any noticeable difference, if anything just a slight
improvement...

> > So yeah.... whatever ;-)

... I'm going to post a v4 using your proposal ;-)

Thanks Patrick

-- 
#include <best/regards.h>

Patrick Bellasi