Re: [PATCH] powerpc/pseries/cpuhp: respect current SMT when adding new CPU
From: Laurent Dufour <hidden>
Date: 2023-03-30 15:52:40
Also in:
lkml
On 13/02/2023 16:40:50, Nathan Lynch wrote:
> Michal Suchánek [off-list ref] writes:
>> On Mon, Feb 13, 2023 at 08:46:50AM -0600, Nathan Lynch wrote:
>>> Laurent Dufour [off-list ref] writes:
>>>> When a new CPU is added, the kernel is activating all its threads. This leads to weird, but functional, result when adding CPU on a SMT 4 system for instance.
>>>>
>>>> Here the newly added CPU 1 has 8 threads while the other one has 4 threads active (system has been booted with the 'smt-enabled=4' kernel option):
>>>>
>>>> ltcden3-lp12:~ # ppc64_cpu --info
>>>> Core 0: 0* 1* 2* 3* 4 5 6 7
>>>> Core 1: 8* 9* 10* 11* 12* 13* 14* 15*
>>>>
>>>> There is no SMT value in the kernel. It is possible to run unbalanced LPAR with 2 threads for a CPU, 4 for another one, and 5 on the latest.
>>>>
>>>> To work around this possibility, and assuming that the LPAR run with the same number of threads for each CPU, which is the common case,
>>>
>>> I am skeptical at best of baking that assumption into this code. Mixed SMT modes within a partition doesn't strike me as an unreasonable possibility for some use cases. And if that's wrong, then we should just add a global smt value instead of using heuristics.
>>>
>>>> the number of active threads of the CPU doing the hot-plug operation is computed. Only that number of threads will be activated for the newly added CPU.
>>>>
>>>> This way on a LPAR running in SMT=4, newly added CPU will be running 4 threads, which is what a end user would expect.
>>>
>>> I could see why most users would prefer this new behavior. But surely some users have come to expect the existing behavior, which has been in place for years, and developed workarounds that might be broken by this change?
>>>
>>> I would suggest that to handle this well, we need to give user space more ability to tell the kernel what actions to take on added cores, on an opt-in basis.
>>>
>>> This could take the form of extending the DLPAR sysfs command set:
>>>
>>> Option 1 - Add a flag that tells the kernel not to online any threads at all; user space will online the desired threads later.
>>>
>>> Option 2 - Add an option that tells the kernel which SMT mode to apply.
>>
>> powerpc-utils grew some drmgr hooks recently so maybe the policy can be moved to userspace?
>
> I'm not sure whether the hook mechanism would come into play, but yes, I am suggesting that user space be given the option of overriding the kernel's current behavior.
Indeed, that's not so easy.

There are multiple ways for the SMT level to be impacted:
 - the smt-enabled kernel option
 - the smtstate systemctl service (if activated), which saves the SMT level at shutdown time to restore it at boot time
 - the pseries-energyd daemon (if activated), which could turn off threads
 - the ppc64_cpu --smt=x user command
 - direct sysfs writes to turn specific threads off/on

There is no SMT level saved, on "disk" or in the kernel, and any of these options can interact in parallel. So from the user space point of view, the best we could do is look at the current SMT values (there could be multiple values in the case of a mixed SMT state), pick one value and apply it.

Extending the drmgr hook is still valid, and I sent a patch series to the powerpc-utils mailing list to achieve that. However, changing the SMT level in that hook means that all the threads of the newly added CPU will first be turned on, so there is a window where these extra threads could be seen active. Not a big deal, but not turning on these extra threads looks better to me.

That being said, I can't see any benefit of a user space implementation compared to the option I'm proposing in that patch.

Does anyone have a better idea?

Cheers,
Laurent.
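
For illustration only, the user space heuristic described above (look at the current per-core SMT values, notice a mixed state, pick one value) could be sketched roughly as below. This is not code from drmgr or powerpc-utils. The function names `active_threads_per_core` and `pick_smt` are hypothetical, the input is assumed to have the same layout as the `ppc64_cpu --info` output quoted in the patch description, and the tie-breaking policy (most common thread count, lowest value on a tie) is an arbitrary assumption.

```python
import re
from collections import Counter

def active_threads_per_core(info_text):
    """Map core id -> number of online threads.

    Assumes lines shaped like the `ppc64_cpu --info` output quoted in
    this thread, where an online thread is marked with a trailing '*'.
    """
    cores = {}
    for line in info_text.splitlines():
        m = re.match(r"\s*Core\s+(\d+):(.*)", line)
        if m:
            threads = m.group(2).split()
            cores[int(m.group(1))] = sum(t.endswith("*") for t in threads)
    return cores

def pick_smt(cores):
    """Return (smt_value, mixed) for a core -> online-thread-count map.

    Hypothetical policy: take the most common thread count and, on a
    tie, the lowest one. 'mixed' is True when cores disagree.
    """
    if not cores:
        raise ValueError("no core found in input")
    counts = Counter(cores.values())
    best = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))[0]
    return best, len(counts) > 1

# Sample taken from the patch description (SMT=4 boot, then a core added).
SAMPLE = """\
Core 0: 0* 1* 2* 3* 4 5 6 7
Core 1: 8* 9* 10* 11* 12* 13* 14* 15*
"""

if __name__ == "__main__":
    cores = active_threads_per_core(SAMPLE)
    print(cores, pick_smt(cores))
```

Whatever policy is chosen here, it only runs after the new core's threads have already been brought online, which is the window mentioned above.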