Re: [PATCH 1/2] powerpc/workqueue: update list of possible CPUs
From: Laurent Vivier <hidden>
Date: 2017-08-24 12:10:36
Also in:
linuxppc-dev, lkml
On 23/08/2017 15:26, Tejun Heo wrote:
> Hello, Michael.
>
> On Wed, Aug 23, 2017 at 09:00:39PM +1000, Michael Ellerman wrote:
>>> I don't think that's true. The CPU id used in kernel doesn't have to match the physical one and arch code should be able to pre-map CPU IDs to nodes and use the matching one when hotplugging CPUs. I'm not saying that's the best way to solve the problem tho.
>>
>> We already virtualise the CPU numbers, but not the node IDs. And it's the node IDs that are really the problem.
>
> Yeah, it just needs to match up new cpus to the cpu ids assigned to the right node.
We are not able to assign the cpu ids to the right node before the CPU is present, because the firmware doesn't provide the CPU <-> node id mapping before that.
>>> It could be that the best way forward is making cpu <-> node mapping dynamic and properly synchronized.
>>
>> We don't need it to be dynamic (at least for this bug).
>
> The node mapping for that cpu id changes *dynamically* while the system is running and that can race with node-affinity sensitive operations such as memory allocations.
Memory is mapped to the node through its own firmware entry, so I don't think a cpu id change can affect memory affinity. And before we know the node id of the CPU, the CPU is not present, so it can't use memory.
>> Laurent is booting Qemu with a fixed CPU <-> Node mapping, it's just that because some CPUs aren't present at boot we don't know what the node mapping is. (Correct me if I'm wrong Laurent).
>>
>> So all we need is:
>>  - the workqueue code to cope with CPUs that are possible but not online having NUMA_NO_NODE to begin with.
>>  - a way to update the workqueue cpumask when the CPU comes online.
>>
>> Which seems reasonable to me?
>
> Please take a step back and think through the problem again. You can't bandaid it this way.
Could you give some ideas, proposals?
As the firmware doesn't provide the information before the CPU is really plugged, I really don't know how to manage this problem.

Thanks,
Laurent
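
For illustration only, the two steps Michael lists in the quoted text (possible-but-absent CPUs start at NUMA_NO_NODE, and the per-node cpumask is updated when the CPU comes online) can be modelled in a few lines of user-space Python. This is a toy sketch under stated assumptions, not kernel or workqueue code: the `NodeMasks` class and its methods are hypothetical names, and only the `NUMA_NO_NODE` value of -1 is taken from the kernel. It deliberately leaves out locking, which is the synchronization concern Tejun raises.

```python
# Toy user-space model; NodeMasks and its methods are hypothetical names.
NUMA_NO_NODE = -1  # same sentinel value the kernel uses for "node unknown"


class NodeMasks:
    """Per-node CPU sets for a workqueue-like subsystem (illustrative only)."""

    def __init__(self, possible_cpus, boot_cpu_to_node):
        # CPUs that are possible but absent at boot have no known node yet.
        self.cpu_to_node = {
            cpu: boot_cpu_to_node.get(cpu, NUMA_NO_NODE) for cpu in possible_cpus
        }
        # Build the per-node sets only from CPUs whose node is known.
        self.node_cpus = {}
        for cpu, node in self.cpu_to_node.items():
            if node != NUMA_NO_NODE:
                self.node_cpus.setdefault(node, set()).add(cpu)

    def cpu_online(self, cpu, node):
        """Record the node reported for a CPU once it is actually plugged in."""
        old = self.cpu_to_node[cpu]
        if old != NUMA_NO_NODE and old != node:
            # The CPU id was previously tied to another node: drop the stale entry.
            self.node_cpus[old].discard(cpu)
        self.cpu_to_node[cpu] = node
        self.node_cpus.setdefault(node, set()).add(cpu)

    def node_mask(self, cpu):
        """CPUs sharing this CPU's node, or an empty set while the node is unknown."""
        node = self.cpu_to_node[cpu]
        if node == NUMA_NO_NODE:
            return set()
        return set(self.node_cpus[node])


# Four possible CPUs; only CPUs 0 and 1 are present at boot, both on node 0.
masks = NodeMasks(range(4), {0: 0, 1: 0})
# CPU 2 is hot-added later and only then reported to be on node 1.
masks.cpu_online(2, 1)
```

In this model CPU 3 stays at `NUMA_NO_NODE` until it is plugged, and nothing protects a reader of `node_mask()` against a concurrent `cpu_online()`; a real implementation would have to address that race.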