Re: [PATCH v2 0/1] powerpc/numa: Make cpu/memory less numa-node online
From: Michael Ellerman <hidden>
Date: 2024-06-20 12:52:58
On Fri, 17 May 2024 19:55:21 +0530, Nilay Shroff wrote:
On a NUMA-aware system, we make a numa-node online only if that node is attached to cpu/memory. However, it's possible that some PCI/IO device is affinitized to a numa-node which is not currently online. In such a case we set the numa-node id of the corresponding PCI device to -1 (NUMA_NO_NODE). Not assigning the correct numa-node id to a PCI device may impact the performance of that device.

For instance, we have a multi-controller NVMe disk where each controller of the disk is attached to a different PHB (PCI host bridge). Each of these PHBs has a numa-node id assigned during PCI enumeration. If, during PCI enumeration, we find that the numa-node is not online, then we set the numa-node id of the PHB to -1.

If we create a shared namespace and attach it to a multi-controller NVMe disk, then that namespace can be accessed through each controller and, as each controller is connected to a different PHB, it's possible to access the same namespace using multiple PCI channels. While sending IO to a shared namespace, the NVMe driver calculates the optimal IO path using numa-node distance. However, if the numa-node id is not correctly assigned to the NVMe PCIe controller, then the driver may calculate an incorrect NUMA distance and hence select a non-optimal path for sending IO. If this happens, we could observe degraded IO performance. [...]
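To make the path-selection concern above concrete, here is a toy model in Python. It is only a sketch and not the NVMe driver's code: the distance table, the `FALLBACK_DISTANCE` value used when a controller reports NUMA_NO_NODE, and the helper names `path_distance` and `pick_path` are all assumptions made for illustration.

```python
NUMA_NO_NODE = -1

# Hypothetical node-distance table (rows: source node, columns: target node),
# in the style printed by `numactl -H`. Node 2 stands for a node that has
# only a PHB attached and no cpu/memory.
DISTANCE = [
    [10, 20, 40],
    [20, 10, 40],
    [40, 40, 10],
]

# Assumption for this sketch: a controller whose node id is unknown is
# treated as if it were local. The real driver may behave differently.
FALLBACK_DISTANCE = 10


def path_distance(cpu_node, ctrl_node):
    """Distance from the submitting CPU's node to a controller's node."""
    if ctrl_node == NUMA_NO_NODE:
        return FALLBACK_DISTANCE
    return DISTANCE[cpu_node][ctrl_node]


def pick_path(cpu_node, ctrl_nodes):
    """Return the index of the controller with the smallest distance."""
    return min(
        range(len(ctrl_nodes)),
        key=lambda i: path_distance(cpu_node, ctrl_nodes[i]),
    )


# Controller 0 sits on node 1, controller 1 sits on node 2.
# With correct node ids, IO issued from node 0 goes to controller 0.
correct = pick_path(0, [1, 2])

# If node 2 is not online, controller 1 reports NUMA_NO_NODE and, under the
# fallback assumption above, wrongly looks closer than controller 0.
degraded = pick_path(0, [1, NUMA_NO_NODE])
```

In this model `correct` selects controller 0 (distance 20 versus 40), while `degraded` selects controller 1 even though its true distance is 40, which is the kind of non-optimal choice the cover letter describes.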
Applied to powerpc/next.
[1/1] powerpc/numa: Online a node if PHB is attached.
https://git.kernel.org/powerpc/c/11981816e3614156a1fe14a1e8e77094ea46c7d5
cheers