Thread: 11 messages, 3 authors, 2017-08-24

Re: [PATCH 1/2] powerpc/workqueue: update list of possible CPUs

From: Tejun Heo <tj@kernel.org>
Date: 2017-08-22 16:54:44
Also in: linuxppc-dev, lkml

Hello, Michael.

On Tue, Aug 22, 2017 at 11:41:41AM +1000, Michael Ellerman wrote:
> > This is something powerpc needs to fix.
>
> There is no way for us to fix it.

I don't think that's true.  The CPU id used in kernel doesn't have to
match the physical one and arch code should be able to pre-map CPU IDs
to nodes and use the matching one when hotplugging CPUs.  I'm not
saying that's the best way to solve the problem, though.  It could be that
the best way forward is making cpu <-> node mapping dynamic and
properly synchronized.  However, please note that that does mean we
mess up node affinity for things like per-cpu memory which are
allocated before the cpu comes up, so there are some inherent benefits
to keeping the mapping static even if that involves indirection.
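
The static-mapping idea above can be sketched in user-space C.  This is
not kernel code: premap_cpus(), hotplug_add_cpu(), the table names and
the sizes are all hypothetical, and handing out logical ids to nodes in
contiguous blocks is an assumption made only to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   8     /* hypothetical: number of possible logical CPU ids */
#define NR_NODES  2     /* hypothetical: number of NUMA nodes */
#define NO_CPU    (-1)  /* hypothetical: "no id available" marker */

/* Fixed at boot and never changed afterwards: logical CPU id -> node. */
static int logical_to_node[NR_CPUS];
/* Indirection filled in at hotplug time: logical CPU id -> physical id. */
static int logical_to_phys[NR_CPUS];
static bool logical_in_use[NR_CPUS];

/* Pre-map every possible logical id to a node before any CPU comes up. */
static void premap_cpus(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		logical_to_node[cpu] = cpu / (NR_CPUS / NR_NODES);
		logical_to_phys[cpu] = NO_CPU;
		logical_in_use[cpu] = false;
	}
}

/*
 * On hotplug, hand out a free logical id whose pre-assigned node matches
 * the node the physical CPU lives on.  Returns NO_CPU when the ids
 * reserved for that node are exhausted.
 */
static int hotplug_add_cpu(int phys_id, int node)
{
	assert(node >= 0 && node < NR_NODES);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!logical_in_use[cpu] && logical_to_node[cpu] == node) {
			logical_in_use[cpu] = true;
			logical_to_phys[cpu] = phys_id;
			return cpu;
		}
	}
	return NO_CPU;
}

/* The logical id -> node answer is the same before and after hotplug. */
static int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return logical_to_node[cpu];
}
```

Anything allocated per logical CPU before hotplug (per-cpu memory, for
example) would already sit on the right node in this scheme.  The cost
is the extra logical-to-physical lookup and the risk of running out of
ids reserved for a given node.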

> > Workqueue isn't the only one making this assumption. mm as a whole
> > assumes that CPU <-> node mapping is stable regardless of hotplug
> > events.
>
> At least in this case I don't think the mapping changes, it's just we
> don't know the mapping at boot.
>
> Currently we have to report possible but not present CPUs as belonging
> to node 0, because otherwise we trip this helpful piece of code:
>
> 	for_each_possible_cpu(cpu) {
> 		node = cpu_to_node(cpu);
> 		if (WARN_ON(node == NUMA_NO_NODE)) {
> 			pr_warn("workqueue: NUMA node mapping not available for cpu%d, disabling NUMA support\n", cpu);
> 			/* happens iff arch is bonkers, let's just proceed */
> 			return;
> 		}
>
> But if we remove that, we could then accurately report NUMA_NO_NODE at
> boot, and then update the mapping when the CPU is hotplugged.

If you think that making this dynamic is the right way to go, I have
no objection but we should be doing this properly instead of patching
up what seems to be crashing right now.  What synchronization and
notification mechanisms do we need to make cpu <-> node mapping
dynamic?  Do we need any synchronization in memory allocation paths?
If not, why would it be safe?
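
To make the questions above concrete, here is a minimal user-space
sketch of what a dynamic mapping with some synchronization and
notification might look like.  It is an illustration only, not a
proposed kernel design: update_cpu_node(), read_cpu_node(), the notifier
table and record_change() are hypothetical names, NUMA_NO_NODE is
redefined locally to mirror the kernel's value, and C11 atomics stand in
for whatever primitive the kernel would actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS        4     /* hypothetical: number of possible CPUs */
#define NUMA_NO_NODE   (-1)  /* local stand-in mirroring the kernel's value */
#define MAX_NOTIFIERS  4     /* hypothetical: size of the subscriber table */

typedef void (*node_change_fn)(int cpu, int old_node, int new_node);

static _Atomic int cpu_node[NR_CPUS];
static node_change_fn notifiers[MAX_NOTIFIERS];
static size_t nr_notifiers;

/* At "boot", every possible CPU honestly reports that its node is unknown. */
static void mapping_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		atomic_store(&cpu_node[cpu], NUMA_NO_NODE);
	nr_notifiers = 0;
}

/* Subscribers (workqueue, mm, ...) ask to be told when a mapping changes. */
static int register_node_notifier(node_change_fn fn)
{
	if (nr_notifiers == MAX_NOTIFIERS)
		return -1;
	notifiers[nr_notifiers++] = fn;
	return 0;
}

/* Readers get a consistent value, but it may be stale a moment later. */
static int read_cpu_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return atomic_load_explicit(&cpu_node[cpu], memory_order_acquire);
}

/* Hotplug path: publish the new node, then notify every subscriber. */
static void update_cpu_node(int cpu, int new_node)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	int old = atomic_exchange_explicit(&cpu_node[cpu], new_node,
					   memory_order_acq_rel);
	if (old == new_node)
		return;
	for (size_t i = 0; i < nr_notifiers; i++)
		notifiers[i](cpu, old, new_node);
}

/* Example subscriber that only records the last change it was told about. */
static int last_cpu = -1, last_old = NUMA_NO_NODE, last_new = NUMA_NO_NODE;

static void record_change(int cpu, int old_node, int new_node)
{
	last_cpu = cpu;
	last_old = old_node;
	last_new = new_node;
}
```

Note what the sketch does not solve: a caller that read the old node just
before update_cpu_node() ran can still allocate against it afterwards,
which is exactly the allocation-path concern raised above.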

Thanks.

-- 
tejun