Thread: 8 messages, 3 authors, 2023-03-31

Re: [PATCH] powerpc/pseries/cpuhp: respect current SMT when adding new CPU

From: Laurent Dufour <hidden>
Date: 2023-03-30 15:52:40
Also in: lkml

On 13/02/2023 16:40:50, Nathan Lynch wrote:
> Michal Suchánek [off-list ref] writes:
>> On Mon, Feb 13, 2023 at 08:46:50AM -0600, Nathan Lynch wrote:
>>> Laurent Dufour [off-list ref] writes:
>>>> When a new CPU is added, the kernel is activating all its threads. This
>>>> leads to a weird, but functional, result when adding a CPU on an SMT 4
>>>> system, for instance.
>>>>
>>>> Here the newly added CPU 1 has 8 threads while the other one has 4 threads
>>>> active (system has been booted with the 'smt-enabled=4' kernel option):
>>>>
>>>> ltcden3-lp12:~ # ppc64_cpu --info
>>>> Core   0:    0*    1*    2*    3*    4     5     6     7
>>>> Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
>>>>
>>>> There is no SMT value in the kernel. It is possible to run an unbalanced
>>>> LPAR with 2 threads for a CPU, 4 for another one, and 5 for the last one.
>>>>
>>>> To work around this possibility, and assuming that the LPAR runs with the
>>>> same number of threads for each CPU, which is the common case,
>>>
>>> I am skeptical at best of baking that assumption into this code. Mixed
>>> SMT modes within a partition don't strike me as an unreasonable
>>> possibility for some use cases. And if that's wrong, then we should just
>>> add a global smt value instead of using heuristics.
>>>
>>>> the number
>>>> of active threads of the CPU doing the hot-plug operation is computed. Only
>>>> that number of threads will be activated for the newly added CPU.
>>>>
>>>> This way on an LPAR running in SMT=4, a newly added CPU will be running 4
>>>> threads, which is what an end user would expect.
>>>
>>> I could see why most users would prefer this new behavior. But surely
>>> some users have come to expect the existing behavior, which has been in
>>> place for years, and developed workarounds that might be broken by this
>>> change?
>>>
>>> I would suggest that to handle this well, we need to give user space
>>> more ability to tell the kernel what actions to take on added cores, on
>>> an opt-in basis.
>>>
>>> This could take the form of extending the DLPAR sysfs command set:
>>>
>>> Option 1 - Add a flag that tells the kernel not to online any threads at
>>> all; user space will online the desired threads later.
>>>
>>> Option 2 - Add an option that tells the kernel which SMT mode to apply.
>>
>> powerpc-utils grew some drmgr hooks recently so maybe the policy can be
>> moved to userspace?
>
> I'm not sure whether the hook mechanism would come into play, but yes, I
> am suggesting that user space be given the option of overriding the
> kernel's current behavior.

Indeed, that's not so easy. There are multiple ways for the SMT level to be
impacted:
 - smt-enabled kernel option
 - smtstate systemctl service (if activated), which saves the SMT level at
   shutdown time and restores it at boot time
 - pseries-energyd daemon (if activated) could turn off threads
 - ppc64_cpu --smt=x user command
 - sysfs direct writing to turn off/on specific threads.
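
To illustrate the last item, here is a minimal sketch of setting the SMT
level of one core by writing to the per-thread "online" files in sysfs. The
helper name set_core_smt, its parameters, and the assumption that the threads
of a core are numbered contiguously are mine, not something taken from the
patch or from ppc64_cpu. On a real system this needs root.

```python
from pathlib import Path


def set_core_smt(first_cpu: int, threads_per_core: int, smt: int,
                 sysfs_root: str = "/sys/devices/system/cpu") -> None:
    """Online the first `smt` threads of a core and offline the others.

    Hypothetical helper: assumes the core's threads are the logical CPUs
    first_cpu .. first_cpu + threads_per_core - 1.
    """
    if not 1 <= smt <= threads_per_core:
        raise ValueError("smt must be between 1 and threads_per_core")
    for i in range(threads_per_core):
        online_file = Path(sysfs_root) / f"cpu{first_cpu + i}" / "online"
        # Writing "1" brings the thread up, "0" takes it down.
        online_file.write_text("1" if i < smt else "0")
```

For the unbalanced example quoted above, set_core_smt(8, 8, 4) would bring
core 1 back to 4 active threads, but only after all 8 had been started.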

There is no SMT level saved, on "disk" or in the kernel, and any of these
options can interact in parallel. So from the user space point of view, the
best we could do is look up the current SMT values (there could be several
in the case of a mixed SMT state), pick one of them, and apply it.
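
A rough sketch of what such a user space lookup could look like, based only
on the "ppc64_cpu --info" output format shown in the quoted patch description
(an asterisk marks an active thread). The function names and the selection
policy (most common value, smaller value on a tie) are assumptions of mine;
any other policy would be just as arbitrary.

```python
import re
from collections import Counter


def active_threads_per_core(info_text: str) -> dict[int, int]:
    """Map core number -> number of active threads, from `ppc64_cpu --info` text."""
    per_core = {}
    for line in info_text.splitlines():
        m = re.match(r"\s*Core\s+(\d+):(.*)", line)
        if m:
            # Each active thread is printed with a trailing '*'.
            per_core[int(m.group(1))] = m.group(2).count("*")
    return per_core


def pick_smt(per_core: dict[int, int]) -> int:
    """Pick one SMT value: the most common one, the smaller one on a tie.

    Raises ValueError if per_core is empty.
    """
    counts = Counter(per_core.values())
    return min(counts, key=lambda value: (-counts[value], value))


SAMPLE = """\
Core   0:    0*    1*    2*    3*    4     5     6     7
Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
"""

if __name__ == "__main__":
    per_core = active_threads_per_core(SAMPLE)
    print(per_core, pick_smt(per_core))
```

The sample is the mixed state from the patch description, which shows the
weakness of this approach: with one core at 4 and one at 8, the "right"
answer depends entirely on the tie-breaking rule.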

Extending the drmgr hooks is still valid, and I sent a patch series to the
powerpc-utils mailing list to achieve that. However, changing the SMT level
in that hook means that the newly added CPU will first be turned on with all
its threads, so there is a window where these threads could be seen active.
Not a big deal, but not turning on these extra threads in the first place
looks better to me.

That being said, I can't see any benefit of a user space implementation
compared to the option I'm proposing in that patch.
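
For reference, the heuristic from the patch description (count the active
threads of the CPU doing the hot-plug operation, then activate only that many
threads on the new CPU) can be modelled roughly as below. This is a user
space model in Python, not the kernel code. The function name, the contiguous
thread numbering, and the choice of onlining the first N threads of the new
core are assumptions made for the illustration.

```python
def threads_to_online(online_cpus: set[int], hotplug_cpu: int,
                      new_core_first_cpu: int,
                      threads_per_core: int = 8) -> list[int]:
    """Return the logical CPUs of the new core that should be onlined.

    Hypothetical model: threads of a core are assumed to be numbered
    contiguously, starting at a multiple of threads_per_core.
    """
    # First thread of the core running the hot-plug operation.
    core_start = (hotplug_cpu // threads_per_core) * threads_per_core
    # Number of threads currently active in that core.
    active = sum(1 for cpu in range(core_start, core_start + threads_per_core)
                 if cpu in online_cpus)
    # Online the same number of threads on the newly added core.
    return [new_core_first_cpu + i for i in range(active)]
```

With core 0 running threads 0-3 and the operation driven from CPU 2,
threads_to_online({0, 1, 2, 3}, 2, 8) gives [8, 9, 10, 11], so the extra
threads 12-15 are never started.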

Does anyone have a better idea?

Cheers,
Laurent.