Thread: 56 messages, 10 authors, 2018-06-05

[PATCH v9 00/12] Support PPTT for ARM64

From: Jeremy Linton <hidden>
Date: 2018-05-29 21:52:10
Also in: linux-acpi, linux-renesas-soc, linux-riscv, lkml
Subsystem: arm64 port (aarch64 architecture), the rest · Maintainers: Catalin Marinas, Will Deacon, Linus Torvalds

Hi,

On 05/29/2018 10:51 AM, Geert Uytterhoeven wrote:
Hi Will,

On Tue, May 29, 2018 at 5:08 PM, Will Deacon [off-list ref] wrote:
quoted
On Tue, May 29, 2018 at 02:18:40PM +0100, Sudeep Holla wrote:
quoted
On 29/05/18 12:56, Geert Uytterhoeven wrote:
quoted
On Tue, May 29, 2018 at 1:14 PM, Sudeep Holla [off-list ref] wrote:
quoted
On 29/05/18 11:48, Geert Uytterhoeven wrote:
quoted
System suspend still works fine on systems with big cores only:

     R-Car H3 ES1.0 (4xCA57 (4xCA53 disabled in firmware))
     R-Car M3-N (2xCA57)

Reverting this commit fixes the issue for me.
I can't find anything that relates to system suspend in these patches
unless they are messing with something during CPU hot plug-in back
during resume.
It's only the last patch that introduces the breakage.
As specified in the commit log, it won't change any behavior for DT
systems if it's non-NUMA or single node system. So I am still wondering
what could trigger this regression.
I wonder if we're somehow giving an uninitialised/invalid NUMA configuration
to the scheduler, although I can't see how this would happen.

Geert -- if you enable CONFIG_DEBUG_PER_CPU_MAPS=y and apply the diff below
do you see anything shouting in dmesg?
Thanks, but unfortunately it doesn't help.
I added some debug code to print cpumask, but so far I don't see anything
suspicious.
I suspect most of the problem is related to the node mask changing at 
unexpected times (particularly cores being removed from the mask). Once 
I understand that more, there may be a simpler patch.

OTOH, I've been testing with the following patch, and with it applied I
can no longer reproduce the problem I found with CONFIG_NUMA disabled.
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index df48212f767b..7450ef5ed733 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -12,6 +12,7 @@ struct cpu_topology {
         cpumask_t thread_sibling;
         cpumask_t core_sibling;
         cpumask_t llc_siblings;
+       cpumask_t node_siblings; /* maintain a stable node sibling list */
  };

  extern struct cpu_topology cpu_topology[NR_CPUS];
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index f3e2e3aec0b0..f4eb80852d78 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -677,8 +677,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
         init_cpu_topology();

         this_cpu = smp_processor_id();
-       store_cpu_topology(this_cpu);
         numa_store_cpu_info(this_cpu);
+       store_cpu_topology(this_cpu);
+

         /*
          * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 7415c166281f..6819c764537d 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -215,7 +215,7 @@ EXPORT_SYMBOL_GPL(cpu_topology);

  const struct cpumask *cpu_coregroup_mask(int cpu)
  {
-       const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+       const cpumask_t *core_mask = &cpu_topology[cpu].node_siblings;

         /* Find the smaller of NUMA, core or LLC siblings */
         if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
@@ -233,12 +233,16 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
  static void update_siblings_masks(unsigned int cpuid)
  {
         struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+       int node = cpu_to_node(cpuid);
         int cpu;

         /* update core and thread sibling masks */
         for_each_possible_cpu(cpu) {
                 cpu_topo = &cpu_topology[cpu];

+               if (cpu_to_node(cpu) == node)
+                       cpumask_set_cpu(cpu, &cpu_topo->node_siblings);
+
                 if (cpuid_topo->llc_id == cpu_topo->llc_id)
                         cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);
@@ -311,6 +315,9 @@ static void __init reset_cpu_topology(void)
                 cpumask_clear(&cpu_topo->llc_siblings);
                 cpumask_set_cpu(cpu, &cpu_topo->llc_siblings);

+               cpumask_clear(&cpu_topo->node_siblings);
+               cpumask_set_cpu(cpu, &cpu_topo->node_siblings);
+
                 cpumask_clear(&cpu_topo->core_sibling);
                 cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
                 cpumask_clear(&cpu_topo->thread_sibling);
-- 
2.14.3
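
For readers following the reasoning above, here is a minimal userspace sketch of why a node mask that changes at runtime could alter the result of the subset test in cpu_coregroup_mask(). It is not kernel code: mask_t, mask_subset() and pick_coregroup() are hypothetical stand-ins for cpumask_t, cpumask_subset() and the first selection step of cpu_coregroup_mask(), and the scenario (the core sibling mask staying put while the node mask loses a CPU) is an assumption used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for cpumask_t: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/* Same meaning as cpumask_subset(a, b): every CPU in a is also in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Models the first selection step of cpu_coregroup_mask(): start from
 * the node mask and switch to core_sibling only when core_sibling fits
 * entirely inside the node mask.
 */
static mask_t pick_coregroup(mask_t node_mask, mask_t core_sibling)
{
	mask_t core_mask = node_mask;

	if (mask_subset(core_sibling, core_mask))
		core_mask = core_sibling;

	return core_mask;
}
```

Take a single node with eight CPUs (0xFF) and a four-CPU cluster as core_sibling (0x0F). With the full node mask, pick_coregroup(0xFF, 0x0F) yields the cluster, 0x0F. If CPU 1 drops out of the node mask (0xFD) while core_sibling is unchanged, the subset test fails and pick_coregroup(0xFD, 0x0F) yields 0xFD, a mask that now spans both clusters. A per-CPU node_siblings mask that never loses CPUs would keep the first result.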