Thread (61 messages) 61 messages, 5 authors, 2020-12-05

Re: [PATCH v4 12/14] arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system

From: Qais Yousef <hidden>
Date: 2020-12-02 13:00:58
Also in: linux-arch, lkml

On 12/01/20 22:13, Will Deacon wrote:
On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
quoted
On 11/24/20 15:50, Will Deacon wrote:
quoted
If we want to support 32-bit applications, then when we identify a CPU
with mismatched 32-bit EL0 support we must ensure that we will always
have an active 32-bit CPU available to us from then on. This is important
for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
CPUs for compat tasks and forced migration due to a hotplug event will
hang if no 32-bit CPUs are available.

On detecting a mismatch, prevent offlining of either the mismatching CPU
if it is 32-bit capable, or find the first active 32-bit capable CPU
otherwise.
                                       ^^^^^

You use cpumask_any_and(). Better use cpumask_first_and()? We have a truly
random function now, cpumask_any_and_distribute(), if you'd like to pick
something 'truly' random.
I think cpumask_any_and() is better, because it makes it clear that I don't
care about which CPU is chosen (and under the hood it ends up calling
cpumask_first_and() _anyway_). So this is purely cosmetic.
quoted
quoted
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 29017cbb6c8e..fe470683b43e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
 
 static int enable_mismatched_32bit_el0(unsigned int cpu)
 {
+	static int lucky_winner = -1;
+
 	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
 	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
 
@@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
 		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
 	}
 
+	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
+		return 0;
Hmm I'm struggling to get what you're doing here. You're treating CPU0 (the
boot CPU) specially here, but I don't get why?
If our ability to execute 32-bit code is the same as the boot CPU then we
don't have to do anything. That way, we can postpone nominating the lucky
winner until we really need to.
Okay I see what you're doing now. The '== cpu_32bit' part of the check gave me
trouble. If the first N cpus are 64bit only, we'll skip them here. Worth
a comment?

Wouldn't it be better to replace this with a check if cpu_32bit_el0_mask is
empty instead? That would be a lot easier to read.
quoted
quoted
+	if (lucky_winner >= 0)
+		return 0;
+
+	/*
+	 * We've detected a mismatch. We need to keep one of our CPUs with
+	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
+	 * every CPU in the system for a 32-bit task.
+	 */
+	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
+							 cpu_active_mask);
cpumask_any_and() could return an error. It could be hard or even impossible to
trigger, but better check if lucky_winner is not >= nr_cpu_ids before calling
get_cpu_device(lucky_winner) to stay in the safe side and avoid a potential
splat?
I don't see how it can return an error here. There are two cases to
consider:

  1. The CPU being brought online is the first 32-bit-capable CPU. In which
     case, we don't use cpumask_any_and() at all.

  2. The CPU being brought online is the first 64-bit-only CPU. In which
     case, the CPU doing the onlining is 32-bit capable and will be in
     the active mask.
Now I understand the if condition above; yeah I think this will not hit.
The condition above simply guarantees cpu_32bit_el0_mask is not empty. And
since this is online path, there's a guarantee there's a single bit shared
between the 2 masks since the same path must have set this shared bit.
quoted
We can do better by the way and do smarter check in remove_cpu() to block
offlining the last aarch32 capable CPU without 'hardcoding' a specific cpu. But
won't insist and happy to wait for someone to come complaining this is not good
enough first.
I couldn't find a satisfactory way to do this without the possibility of
subtle races, so I'd prefer to keep it simple for the moment. In particular,
I wanted to make sure that somebody iterating over the cpu_possible_mask
and calling is_cpu_allowed(p, cpu) for each CPU and a 32-bit task can not
reach the end of the mask without ever getting a value of 'true'.

I'm open to revisiting this once some of this is merged, but right now
I don't think it's needed and it certainly adds complexity.
Agreed. I just wanted to share some awareness. Let's not make this series more
complicated than it needs to be.
quoted
Some vendors play games with hotplug to help with saving power. They might want
to dynamically nominate the last man standing 32bit capable CPU. Again, we can
wait for someone to complain first I guess.
The reality is that either all "big" cores or all "little" cores will be the
ones that are 32-bit capable, so I doubt it matters an awful lot which one
of the cluster is left online from a PM perspective. The real problem is
that a core has to be left online at all, but I don't think we can avoid
that.
I don't have specific info, but in theory it could matter. We enforce
a specific core to be always online, rather than allow any core to be offlined
until there's a single one left in the mask. I agree we shouldn't worry about
this for now.

And yes, we can't avoid keeping the last 32bit capable cpu online. The same way
we can't offline the last cpu in a normal system.

Thanks

--
Qais Yousef

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help