Re: [PATCH v4 10/14] cpuidle: psci: Prepare to use OS initiated suspend mode... | linux-arm-kernel


On Fri, 20 Dec 2019 at 11:01, Sudeep Holla [off-list ref] wrote:
On Thu, Dec 19, 2019 at 10:33:34PM +0100, Ulf Hansson wrote:
On Thu, 19 Dec 2019 at 19:01, Sudeep Holla [off-list ref] wrote:
On Thu, Dec 19, 2019 at 04:48:13PM +0100, Ulf Hansson wrote:
On Thu, 19 Dec 2019 at 15:32, Sudeep Holla [off-list ref] wrote:
On Wed, Dec 11, 2019 at 04:43:39PM +0100, Ulf Hansson wrote:
The per CPU variable psci_power_state, contains an array of fixed values,
which reflects the corresponding arm,psci-suspend-param parsed from DT, for
each of the available CPU idle states.

This isn't sufficient when using the hierarchical CPU topology in DT, in
combination with having PSCI OS initiated (OSI) mode enabled. More
precisely, in OSI mode, Linux is responsible for telling the PSCI FW what
idle state the cluster (a group of CPUs) should enter, while in PSCI
Platform Coordinated (PC) mode, each CPU independently votes for an idle
state of the cluster.

For this reason, introduce a per CPU variable called domain_state and
implement two helper functions to read/write its value. Then let the
domain_state take precedence over the regular selected state, when entering
an idle state.

To avoid executing the above OSI specific code in the ->enter() callback,
while operating in the default PSCI Platform Coordinated mode, let's also
add a new enter-function and use it for OSI.
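
The precedence rule described above can be modelled in user space. This is only a rough sketch, not the kernel code: the helper bodies are elided in the quoted patch and are not reproduced here, every model_* name is hypothetical, and a plain static variable stands in for the per CPU variable.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per CPU domain_state variable (assumption). */
static uint32_t model_domain_state;

/* Hypothetical write helper: records the state chosen for the domain. */
static void model_set_domain_state(uint32_t state)
{
	model_domain_state = state;
}

/* Hypothetical read helper: returns the pending domain state, 0 if none. */
static uint32_t model_get_domain_state(void)
{
	return model_domain_state;
}

/*
 * Models the OSI enter path: a non-zero domain state takes precedence over
 * the CPU's own state for this index, and the domain state is cleared
 * afterwards. Returns the value that would be used as the suspend parameter.
 */
static uint32_t model_enter_domain_idle(const uint32_t *states, int idx)
{
	uint32_t state = model_get_domain_state();

	if (!state)
		state = states[idx];

	/* Clear the domain state to start fresh when back from idle. */
	model_set_domain_state(0);
	return state;
}
```

With no domain state pending, the CPU's own parameter for the selected index is used. Once a domain state has been recorded it wins for exactly one idle entry, because it is cleared on the way out.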

Co-developed-by: Lina Iyer <redacted>
Signed-off-by: Lina Iyer <redacted>
Signed-off-by: Ulf Hansson <redacted>
---

Changes in v4:
      - Rebased on top of earlier changes.
      - Add comment about using the deepest cpuidle state for the domain state
      selection.

---
 drivers/cpuidle/cpuidle-psci.c | 56 ++++++++++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 6a87848be3c3..9600fe674a89 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -29,14 +29,47 @@ struct psci_cpuidle_data {
 };

 static DEFINE_PER_CPU_READ_MOSTLY(struct psci_cpuidle_data, psci_cpuidle_data);
+static DEFINE_PER_CPU(u32, domain_state);
+
[...]

+static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
+                                     struct cpuidle_driver *drv, int idx)
+{
+     struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
+     u32 *states = data->psci_states;
Why can't the above be like this for consistency (see below in
psci_enter_idle_state)?
You have a point, however in patch11 I am adding this line below.

struct device *pd_dev = data->dev;

So I don't think it matters much, agree?
Ah OK, it looked odd as part of this patch; maybe you could have moved
this change into that patch. Anyway, fine as is.
Okay, then I'd rather just keep it.


        u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);

+     u32 state = psci_get_domain_state();
+     int ret;
+
+     if (!state)
+             state = states[idx];
+
+     ret = psci_enter_state(idx, state);
+
+     /* Clear the domain state to start fresh when back from idle. */
+     psci_set_domain_state(0);
+     return ret;
+}
[...]

@@ -118,6 +152,15 @@ static int __init psci_dt_cpu_init_idle(struct device_node *cpu_node,
                      ret = PTR_ERR(data->dev);
                      goto free_mem;
              }
+
+             /*
+              * Using the deepest state for the CPU to trigger a potential
+              * selection of a shared state for the domain, assumes the
+              * domain states are all deeper states.
+              */
+             if (data->dev)
You can drop this check, as we return on error above.
Actually not, because if OSI is supported, there is still a
possibility that the PM domain topology isn't used.
And how do we support that ? I am missing something here.

This means ->data->dev is NULL.
I don't get that.
This is quite similar to the existing limited support we have for OSI today.

We are using the idle states for the CPU, but ignoring the idle states
for the cluster. If you just skip applying the DTS patch14, this is
what happens.
No, if psci_set_osi fails, we shouldn't create a genpd domain, as we don't
enter any cluster state. The default mode (same as PC) should work, and it
doesn't need any genpd domains. Adding one that is unused is just confusing.
Please avoid that.
I am deferring to the other thread to continue this discussion.




+                     drv->states[state_count - 1].enter =
+                             psci_enter_domain_idle_state;
I see the comment above, but this potentially blocks retention mode at
cluster level when all CPUs enter retention at CPU level. I don't like
this assumption, but I don't have any better suggestion. Please add a
note that we can't enter the RETENTION state at cluster/domain level when
all CPUs enter it at CPU level.
You are correct, but I think the comment a few lines above (agreed to
be added by Lorenzo in the previous version) should be enough to
explain that. No?

The point is, this is only a problem if cluster RETENTION is
considered to be a shallower state than CPU power off, for example.
Yes, but giving examples makes it better and helps people who may be
wondering why the cluster retention state is not being entered. You can just
add to the above comment:

"e.g. If CPU Retention is one of the shallower states, then we can't enter
any of the allowed domain states."
Hmm, that's not a correct statement I think, let me elaborate.

The problem is, that in case the CPU has both RETENTION and POWER OFF
(deepest CPU state), we would only be able to reach a cluster state
(RETENTION or POWER OFF) when the CPUs are in CPU POWER OFF (as that's
the deepest).
Sorry for the poor choice of words. What I meant is that only one can be
the deepest, and it will be CPU POWER OFF if it exists at the CPU level.
RETENTION (again, if it exists) is shallower (or rather deeper, but not the
deepest state).

This is okay, as long as a cluster RETENTION state is considered to be
"deeper" than the CPU POWER OFF state. However, if that isn't the
case, it means the cluster RETENTION state is not considered in the
correct order, but it's still possible to reach as a "domain state".
Again, sorry for not being clear, I was referring to CPU RET + CLUSTER RET.

I think this is all kind of summarized in the comment I agreed upon
with Lorenzo, but if you still think there is some clarification
needed, I am happy to add it.

Makes sense?
OK, if you are happy, that's fine. I just wanted to clearly state that CPU RET
+ CLUSTER RET is not possible with this implementation.
Okay!

I will then leave this as is. When/if you find a better wording of the
comment, you can always send a patch on top.
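
For readers following the ordering discussion, here is a small user-space model of why CPU RET + CLUSTER RET is not reachable. It is a sketch under assumptions, not the driver code: all model_* names are hypothetical, and the return values are only tags showing which enter path ran. As in the hunk above, only the deepest CPU state gets the domain-aware enter function.

```c
#define MODEL_MAX_STATES 4

/* Tags identifying which enter path ran (illustration only). */
enum { MODEL_CPU_PATH = 1, MODEL_DOMAIN_PATH = 2 };

typedef int (*model_enter_fn)(int idx);

/* Plain CPU path: never looks at a domain state. */
static int model_enter_cpu(int idx)
{
	(void)idx;
	return MODEL_CPU_PATH;
}

/* Domain-aware path: the only one that could pick up a cluster state. */
static int model_enter_domain(int idx)
{
	(void)idx;
	return MODEL_DOMAIN_PATH;
}

/* Hypothetical, reduced stand-in for the driver's table of idle states. */
struct model_driver {
	model_enter_fn enter[MODEL_MAX_STATES];
	int state_count;
};

/*
 * Give every state the plain CPU enter function, then, if a PM domain is
 * in use, override only the deepest state (the last index).
 */
static void model_init(struct model_driver *drv, int state_count, int use_domain)
{
	int i;

	if (state_count > MODEL_MAX_STATES)
		state_count = MODEL_MAX_STATES;

	for (i = 0; i < state_count; i++)
		drv->enter[i] = model_enter_cpu;
	drv->state_count = state_count;

	if (use_domain && state_count > 0)
		drv->enter[state_count - 1] = model_enter_domain;
}
```

With two CPU states, say retention at index 0 and power off at index 1, only index 1 goes through the domain-aware path. A CPU sitting in retention therefore never contributes to selecting a cluster state, whatever that cluster state is.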

Kind regards
Uffe

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
