Re: [RFC PATCH 06/16] arm: topology: Define TC2 sched energy and provide it to scheduler
From: Jacob Pan <hidden>
Date: 2014-06-06 16:28:19
Also in: lkml
On Fri, 6 Jun 2014 08:35:21 +0800 Yuyang Du [off-list ref] wrote:
> On Fri, Jun 06, 2014 at 10:05:43AM +0200, Peter Zijlstra wrote:
>> On Fri, Jun 06, 2014 at 04:29:30AM +0800, Yuyang Du wrote:
>>> On Thu, Jun 05, 2014 at 08:03:15AM -0700, Dirk Brandewie wrote:
>>>> You can request a P state per core but the package does coordination at a package level for the P state that will be used based on all requests. This is due to the fact that most SKUs have a single VR and PLL. So the highest P state wins. When a core goes idle it loses its vote for the current package P state and that core's clock is turned off.
>>> You need to differentiate Turbo and non-Turbo. The highest P state wins? Not really.
>> *sigh* and here we go again.. someone please, write something coherent and have all intel people sign off on it and stop saying different things.
>>> Actually, silicon supports independent non-Turbo pstate, but just not enabled.
>> Then it doesn't exist, so no point in mentioning it.
> Well, things actually get more complicated. Not-enabled is for Core. For Atom Baytrail, each core indeed can operate on a different frequency. I am not sure for Xeon, :)
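As an illustration only, the package-level coordination that Dirk describes above (each active core votes, the highest request wins, an idle core loses its vote) can be sketched as follows. This is a simplified model of the non-Turbo case as stated in the thread, not the hardware's actual algorithm; the thread itself disputes the rule for Turbo and notes that Baytrail behaves differently. The function name, its parameters, and the convention that a larger number means a faster P state are assumptions made for this sketch.

```python
from typing import Optional, Sequence


def package_pstate(requests: Sequence[Optional[int]], min_pstate: int) -> int:
    """Return the package-wide P state chosen from per-core requests.

    Each entry is the P state one core requested (assumption: a larger
    number means a higher frequency), or None when that core is idle
    and therefore has no vote.
    """
    votes = [r for r in requests if r is not None]
    # If every core is idle nobody votes, so fall back to the floor.
    return max(votes) if votes else min_pstate


# Core 1 asks for 24, so the whole package runs at 24.
print(package_pstate([16, 24, None, 8], min_pstate=8))
# Once core 1 goes idle, the package can drop to the next highest vote.
print(package_pstate([16, None, None, 8], min_pstate=8))
```

In this model a single busy core holds the package P state up for every other core, which is why a core going idle matters so much for the package as a whole.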
>>> For Turbo, it basically depends on power budget of both core and gfx (because they share) for each core to get which Turbo point.
>> And RAPL controls can give preference of which gfx/core gets most, right?
There are two controls that can influence gfx and core power budget sharing:
1. set a power limit on each RAPL domain
2. turbo power budget sharing
#2 is not implemented yet; it defaults to the CPU taking all.
>>>> intel_pstate tries to keep the core P state as low as possible to satisfy the given load, so when various cores go idle the package P state can be as low as possible. The big power win is a core going idle.
>>> In terms of prediction, it definitely can't be 100% right. But the performance of most workloads does scale with pstate (frequency), though maybe not linearly. So it is to some point predictable FWIW. And this is all governors' and intel_pstate's basic assumption.
>> So frequency isn't _that_ interesting, voltage is. And while predictability might be their assumption, is it actually true? I mean, there's really nothing else except to assume that, if it's not you can't do anything at all, so you _have_ to assume this. But again, is the assumption true? Or just happy thoughts in an attempt to do something.
> Voltage is combined with frequency: roughly, voltage is proportional to frequency, so roughly, power is proportional to voltage^3. You can't say which is more important, or there is no reason to raise voltage without raising frequency.
>
> If only one word to say, true or false, it is true. Because given any fixed workload, I can't see why performance would be worse if frequency is higher.
>
> The reality as opposed to the assumption is two-fold:
> 1) If the workload is CPU bound, performance scales with frequency absolutely. If the workload is memory bound, it does not scale. But from the kernel, we don't know whether it is CPU bound or not (or it is hard to know). uArch statistics can model that.
> 2) The workload is not fixed in real time; it is changing all the time.
>
> But still, the assumption is a must, and a harmless one, because we adjust frequency continuously: for example, if the workload is fixed and the performance does not scale with freq, we stop increasing frequency. So a good frequency governor or driver should and can continuously pursue a "good" frequency with the changing workload. Therefore, in the long term, we will be better off.
[Jacob Pan]
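The rough scaling argument quoted above (voltage proportional to frequency, hence power proportional to voltage^3) can be illustrated with a minimal sketch. It assumes the usual dynamic-power relation P = C * V^2 * f, which the thread does not state explicitly, and it ignores leakage power. The function name and parameter are hypothetical, chosen only for this example.

```python
def relative_power(freq_ratio: float) -> float:
    """Dynamic power relative to a baseline when frequency scales by freq_ratio.

    Assumptions (not from the thread): P = C * V^2 * f with constant C,
    and voltage scaling linearly with frequency, so P scales as f^3.
    """
    voltage_ratio = freq_ratio  # assumption: V tracks f linearly
    return voltage_ratio ** 2 * freq_ratio


# Doubling frequency (and with it voltage) costs about 8x the dynamic power.
print(relative_power(2.0))
# Halving frequency cuts dynamic power to one eighth.
print(relative_power(0.5))
```

Under these assumptions the cubic relation is the same whether it is written in terms of voltage or frequency, which is consistent with the point that the two cannot be ranked separately.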