[PATCH v8 01/11] ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
From: computersforpeace@gmail.com (Brian Norris)
Date: 2014-08-13 23:47:12
Also in:
linux-devicetree, lkml
Hi Russell, Picking up this thread again, as things are now set for dropping this patch and resubmitting SMP support for 3.18. On Sat, Aug 02, 2014 at 10:27:56AM +0100, Russell King wrote:
On Thu, Jul 31, 2014 at 03:06:42PM -0700, Brian Norris wrote:quoted
Yes, I noticed this. What I meant is that smp_ops.cpu_die() and smp_ops.cpu_kill() are not synchronized....quoted
We're not relying on the L1 cache, though. Don't sync_cache_{r,w}() ensure all reads/writes reach at least the L2?What exactly are you trying to achieve here?
Synchronization between v7_exit_coherency_flush() (on the dying CPU) and yanking the power (brcmstb_cpu_kill(), on the murderous CPU). The core completion-based synchronization is not sufficient, since it allows brcmstb_smp_ops.smp_kill and brcmstb_smp_ops.smp_die to race. Am I somehow not achieving what I intend here?
quoted
How does that ensure that the CPU is down by the time the work is scheduled? It seems like this would just defer the work long enough that it *most likely* has quiesced, but I don't see how this gives us a better guarantee. Or maybe I'm missing something. (If so, please do enlighten!)Note that I said a delayed work queue. The dying CPU runs a predictable sequence once cpu_die() has been entered - interrupts at the GIC have been programmed to be routed to other CPUs, interrupts at the CPU are masked, so the CPU isn't going to be doing anything else except executing that code path. So, it's going to be a predictable number of CPU cycles. That allows you to arrange a delayed workqueue (or a timer) to fire after a period of time that you can guarantee that the dying CPU has reached that wfi().
OK, that sounds workable for the active hotplug case. But what about for the suspend case? CPUs are hot-unplugged during disable_nonboot_cpus(), and I don't see that this would guarantee the workqueue will complete before we enter suspend.
Another point which raises itself in your patch is this: + /* Settle-time from Broadcom-internal DVT reference code */ + udelay(7); 7us looks very precise, but udelay() may not be that precise. What is the actual specification? Does it say "you must wait at least 7us"? udelay() _may_ return early, especially if it is using the CPU delay loop to perform the delay - I've explained why this happens previously, and why it isn't a bug. If you're using a timer-based delay for udelay() (which you should be using if you support cpufreq) then the delay should be more accurate, but it's still good practise to give a little leeway on the figure.
I'm looking into this specific delay. I'd bet it's just "wait at least 7us." I could probably factor in some leeway to be safe. Thanks, Brian