Thread: 57 messages, 7 authors, 2019-10-30

Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

From: Michal Hocko <mhocko@kernel.org>
Date: 2019-09-24 12:25:10
Also in: linux-alpha, linux-mips, linux-s390, linux-sh, lkml, sparclinux

On Tue 24-09-19 14:09:43, Peter Zijlstra wrote:
> On Tue, Sep 24, 2019 at 01:54:01PM +0200, Michal Hocko wrote:
> > On Tue 24-09-19 13:23:49, Peter Zijlstra wrote:
> > > On Tue, Sep 24, 2019 at 12:56:22PM +0200, Michal Hocko wrote:
> > [...]
> > > > To be honest I really fail to see why to object to a simple semantic
> > > > that NUMA_NO_NODE imply all usable cpus. Could you explain that please?
> > > Because it feels wrong. The device needs to be _somewhere_. It simply
> > > cannot be node-less.
> > What if it doesn't have any numa preference for what ever reason? There
> > is no other way to express that than NUMA_NO_NODE.
> Like I said; how does that physically work? The device needs to be
> somewhere. It _must_ have a preference.
> > Anyway, I am not going to argue more about this because it seems more of
> > a discussion about "HW shouldn't be doing that although the specification
> > allows that" which cannot really have any outcome except of "feels
> > correct/wrong".
> We can push back and say we don't respect the specification because it
> is batshit insane ;-)
Here is my fingers crossed.

[...]
> Now granted; there's a number of virtual devices that really don't have
> a node affinity, but then, those are not hurt by forcing them onto a
> random node, they really don't do anything. Like:
Do you really consider a random node a better fix than simply living
with a more robust NUMA_NO_NODE which tells the actual state? The page
allocator would effectively use the local node in that case. Any code
using the cpumask will know that any of the online cpus are usable.
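
A minimal userspace sketch of the semantic described above, assuming a toy
two-node topology. This is not the kernel patch itself: the names
cpumask_of_node_sketch, node_cpus and cpu_online, the 64-bit mask type and
the CPU layout are all hypothetical and only illustrate the idea that
NUMA_NO_NODE maps to every online cpu instead of a guessed node.

```c
#include <assert.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)   /* same value the kernel uses */
#define MAX_NODES    2      /* hypothetical toy topology */

typedef uint64_t cpumask_t; /* toy mask: bit n set means CPU n */

/* Hypothetical layout: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const cpumask_t node_cpus[MAX_NODES] = { 0x0fu, 0xf0u };

/* Hypothetical online set: CPUs 0-6 are online, CPU 7 is offline. */
static const cpumask_t cpu_online = 0x7fu;

/*
 * Return the CPUs considered usable for something attached to `node`.
 * A caller that has no affinity passes NUMA_NO_NODE and gets every
 * online CPU back, rather than the mask of an arbitrarily chosen node.
 */
static cpumask_t cpumask_of_node_sketch(int node)
{
    if (node == NUMA_NO_NODE)
        return cpu_online;

    /* Any other value must name a real node in this toy model. */
    assert(node >= 0 && node < MAX_NODES);
    return node_cpus[node];
}
```

With this shape a caller does not need a special case for devices without
an affinity; it can use whatever mask comes back.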

Compare that to a wild guess that might be easily wrong and have subtle
side effects which are really hard to debug. You will only see a higher
utilization on a specific node. Good luck with a bug report like that.

Anyway, I really do not feel strongly about that. If you really consider
it a bad idea then I can live with that. This just felt easier and
reasonably consistent to address. Implementing the guessing and fighting
vendors who really do not feel like providing a real affinity sounds
harder and more error prone.
-- 
Michal Hocko
SUSE Labs