Thread (57 messages), 7 authors, 2019-10-30

Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

From: Peter Zijlstra <peterz@infradead.org>
Date: 2019-09-24 11:25:40
Also in: linux-mips, linux-s390, linux-sh, linuxppc-dev, lkml, sparclinux

On Tue, Sep 24, 2019 at 12:56:22PM +0200, Michal Hocko wrote:
On Tue 24-09-19 11:17:14, Peter Zijlstra wrote:
On Tue, Sep 24, 2019 at 09:47:51AM +0200, Michal Hocko wrote:
On Mon 23-09-19 22:34:10, Peter Zijlstra wrote:
On Mon, Sep 23, 2019 at 06:52:35PM +0200, Michal Hocko wrote:
[...]
Even the
ACPI standard is considering this optional. Yunsheng Lin has referred to
the specific part of the standard in one of the earlier discussions.
Trying to guess the node affinity is worse than providing all CPUs IMHO.
I'm saying the ACPI standard is wrong.
Even if you were right on this, the reality is that HW is likely to
follow that standard and we cannot rule out NUMA_NO_NODE being
specified. As of now we would access beyond the defined array and that
is clearly a bug.
Right, because the device node is wrong, so we fix _that_!
Let's assume that this is really a bug for a moment. What are you going
to do about that? BUG_ON? I do not really see any solution other than to either
provide something sensible or BUG_ON. If you are worried about a
conditional then this should be pretty easy to solve by starting the
array at -1 index and associate it with the online cpu mask.
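For illustration only, the "start the array at -1" idea could look roughly like the userspace sketch below. This is not the kernel's code: MAX_NODES, the single-word cpumask_t, setup_masks() and the storage layout are all assumptions made up for the example. The only point is that slot -1 holds the online mask, so a lookup needs no conditional.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)  /* same value the kernel uses */
#define MAX_NODES    4     /* hypothetical, kept small for illustration */

typedef unsigned long cpumask_t;  /* simplified: one bit per CPU */

/* One extra slot in front: storage[0] backs index -1 (NUMA_NO_NODE). */
static cpumask_t storage[MAX_NODES + 1];
static cpumask_t *const node_to_cpumask = &storage[1];

/* Hypothetical setup helper: slot -1 gets the online mask. */
static void setup_masks(cpumask_t online, const cpumask_t *per_node)
{
    assert(per_node != NULL);
    node_to_cpumask[NUMA_NO_NODE] = online;
    for (int n = 0; n < MAX_NODES; n++)
        node_to_cpumask[n] = per_node[n];
}

/* Lookup with no branch: NUMA_NO_NODE simply indexes the extra slot. */
static cpumask_t cpumask_of_node(int node)
{
    return node_to_cpumask[node];
}
```

The cost of this layout is one extra array slot; values below -1 would still index out of bounds.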
The same thing I proposed earlier; force the device node to 0 (or any
other convenient random valid value) and issue a FW_BUG message to the
console.
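As a rough userspace sketch of this alternative (again, not the kernel's code): struct device here is a stripped-down stand-in, dev_fixup_node() is a hypothetical helper, and fprintf() stands in for the kernel's printk. FW_BUG is the "[Firmware Bug]: " prefix string the kernel prepends to such messages.

```c
#include <assert.h>
#include <stdio.h>

#define NUMA_NO_NODE (-1)
#define FW_BUG "[Firmware Bug]: "  /* prefix string, as in the kernel */

/* Hypothetical, simplified stand-in for the kernel's struct device. */
struct device {
    const char *name;
    int numa_node;
};

/*
 * Hypothetical helper: if firmware left the node unset, complain and
 * force a valid node (0 here, an arbitrary choice) so that later
 * lookups never see NUMA_NO_NODE.
 */
static int dev_fixup_node(struct device *dev)
{
    assert(dev != NULL);
    if (dev->numa_node == NUMA_NO_NODE) {
        fprintf(stderr, FW_BUG "%s has no NUMA node, forcing node 0\n",
                dev->name);
        dev->numa_node = 0;
    }
    return dev->numa_node;
}
```

With this approach the fixup happens once, where the device is set up, and the lookup path stays unchanged.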
Why would you "fix" anything and how do you know that node 0 is the
right choice? I have seen setups with node 0 without any memory and
similar unexpected things.
We don't know 0 is right; but we know 'unknown' is wrong, so we FW_BUG
and pick _something_.
To be honest I really fail to see why you object to the simple semantic
that NUMA_NO_NODE implies all usable CPUs. Could you explain that please?
Because it feels wrong. The device needs to be _somewhere_. It simply
cannot be node-less.