Re: [PATCH v4] cpuidle: Fix last_residency division
From: Balbir Singh <bsingharora@gmail.com>
Date: 2016-07-01 12:41:13
Also in: linux-pm
On Fri, 2016-07-01 at 10:06 +0200, Daniel Lezcano wrote:
> On 06/30/2016 05:37 PM, Nicolas Pitre wrote:
> > On Thu, 30 Jun 2016, Daniel Lezcano wrote:
> >
> > [ ... ]
> >
> > > > +	if (likely(nsec < DIV_APPROXIMATION_THRESHOLD)) {
> > > > +		u32 usec = nsec;
> > > > +
> > > > +		usec += usec >> 5;
> > > > +		usec = usec >> 10;
> > > > +
> > > > +		/* Can safely cast to int since usec is < INT_MAX */
> > > > +		return usec;
> > > > +	} else {
> > > > +		u64 usec = div_u64(nsec, 1000);
> > > > +
> > > > +		if (usec > INT_MAX)
> > > > +			usec = INT_MAX;
> > > > +
> > > > +		/* Can safely cast to int since usec is < INT_MAX */
> > > > +		return usec;
> > > > +	}
> > > > +}
> > >
> > > What bothers me with this division is the benefit of adding an extra
> > > ultra optimized division by 1000 in cpuidle.h while we have already
> > > ktime_divns which is optimized in ktime.h.
> >
> > It is "optimized" but still much heavier than what is presented above
> > as it provides maximum precision. It all depends on how important the
> > performance gain from the original shift by 10 was in the first place.
>
> Actually the original shift was there because it was convenient as a
> simple ~div1000 operation. But against all odds, the approximation
> introduced a regression on a very specific use case on PowerPC. We are
> not in the hot path and I think we can live with a ktime_divns without
> problem. That would simplify the fix I believe.
I would tend to agree with this, and there are better ways to do
multiplicative inverses if we care.

Balbir Singh
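To make the trade-off concrete, here is a minimal userspace sketch of the three ways of turning nanoseconds into microseconds that come up in this thread: the original plain shift by 10, the corrected shift from the quoted hunk, and a multiplicative inverse. The function names are hypothetical and are not kernel APIs. The constant 0x10624DD3 is ceil(2^38 / 1000), which I am assuming only for 32-bit inputs; a 64-bit nsec value would need a wider multiply or a real division such as div_u64() or ktime_divns().

```c
#include <stdint.h>

/*
 * Plain shift, as in the original code: this divides by 1024 rather
 * than 1000, so the result undershoots by roughly 2.3%.
 */
static uint32_t approx_shift10(uint32_t nsec)
{
	return nsec >> 10;
}

/*
 * Corrected shift, same arithmetic as the quoted hunk: roughly
 * nsec * 33 / 32768, i.e. a division by about 993, so the result
 * overshoots by roughly 0.7%. A 64-bit intermediate is used here
 * (an assumption of this sketch) so the addition cannot wrap.
 */
static uint32_t approx_shift_corrected(uint32_t nsec)
{
	uint64_t usec = nsec;

	usec += usec >> 5;
	return (uint32_t)(usec >> 10);
}

/*
 * Multiplicative inverse: one 32x32->64 multiply and one shift.
 * 0x10624DD3 == ceil(2^38 / 1000); the rounding error stays below
 * one unit for every 32-bit input, so this matches nsec / 1000.
 */
static uint32_t div1000_inverse(uint32_t nsec)
{
	return (uint32_t)(((uint64_t)nsec * 0x10624DD3ULL) >> 38);
}
```

For nsec = 1000000 the three functions give 976, 1007 and 1000 respectively, which shows why the shifts are only a ~div1000 while the inverse stays exact at a similar cost.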