Re: [PATCH v2 15/17] libnvdimm: Set numa_node to NVDIMM devices
From: Toshi Kani <hidden>
Date: 2015-06-25 21:51:43
Also in:
linux-fsdevel, lkml, nvdimm
On Thu, 2015-06-25 at 14:31 -0700, Dan Williams wrote:
> On Thu, Jun 25, 2015 at 11:34 AM, Williams, Dan J [off-list ref] wrote:
> > On Thu, 2015-06-25 at 11:45 -0600, Toshi Kani wrote:
> > > On Thu, 2015-06-25 at 05:37 -0400, Dan Williams wrote:
> > > > From: Toshi Kani <redacted>
> > > >
> > > > ACPI NFIT table has System Physical Address Range Structure entries
> > > > that describe a proximity ID of each range when
> > > > ACPI_NFIT_PROXIMITY_VALID is set in the flags.
> > > >
> > > > Change acpi_nfit_register_region() to map a proximity ID to its node
> > > > ID, and set it to a new numa_node field of nd_region_desc, which is
> > > > then conveyed to the nd_region device.
> > > >
> > > > The device core arranges for btt and namespace devices to inherit
> > > > their node from their parent region.
> > > >
> > > > Signed-off-by: Toshi Kani <redacted>
> > > > [djbw: move set_dev_node() from region 'probe' to 'create']
> > >
> > > Sorry, I failed to mention other issue, which led me call
> > > set_dev_node() in probe. nd_async_device_register() calls
> > > device_add(), which does:
> > >
> > >	/* use parent numa_node */
> > >	if (parent)
> > >		set_dev_node(dev, dev_to_node(parent));
> > >
> > > and overwrites numa_node to -1. Since region's parent is ndbusN, we
> > > cannot set numa_node to the parent. So, I had to set it in probe.
> >
> > In general, I still don't like leaving it up to ->probe() which is
> > within its rights to fail and not set the node. How about the
> > following that moves it to the bus uevent code? Should get triggered
> > before probe so the numa_node is valid before userspace is ever
> > notified about the device. device_add() does:
> >
> >	kobject_uevent(&dev->kobj, KOBJ_ADD);
> >	bus_probe_device(dev);
> >
> > ...so I think we're good, agree? I also added a missing init of
> > ndr_desc.numa_node in arch/x86/kernel/pmem.c, see below.
>
> This looks good in a quick manual test. It's interesting/illustrative
> that I inadvertently broke the one bit of the libnvdimm sysfs
> interface that did not have unit test coverage.
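The ordering argument above can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct toy_dev`, `toy_bus_uevent()`, `toy_probe()`, `toy_device_add()` and `toy_region_node()` are hypothetical names invented for the illustration. It assumes only what the thread states, namely that device_add() copies the parent's node into the child, and that the uevent is emitted before the bus probe runs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_NODE (-1)

/* Toy stand-in for struct device; only the fields this sketch needs. */
struct toy_dev {
	struct toy_dev *parent;
	int numa_node;     /* what sysfs would report */
	int desc_node;     /* node taken from the region descriptor */
	int node_at_probe; /* value observed when probe ran */
};

/* Hypothetical bus uevent hook: restore the node from the descriptor. */
static void toy_bus_uevent(struct toy_dev *dev)
{
	if (dev->desc_node != TOY_NO_NODE)
		dev->numa_node = dev->desc_node;
}

/* Hypothetical probe: just records the node it sees. */
static int toy_probe(struct toy_dev *dev)
{
	dev->node_at_probe = dev->numa_node;
	return 0;
}

/* Models the order described in the thread: inherit, uevent, probe. */
static int toy_device_add(struct toy_dev *dev, int use_uevent_hook)
{
	assert(dev != NULL);

	/* use parent numa_node; this overwrites any value set earlier */
	if (dev->parent)
		dev->numa_node = dev->parent->numa_node;

	if (use_uevent_hook)
		toy_bus_uevent(dev);

	return toy_probe(dev);
}

/* Register a region under a bus with no node; return the node seen at probe. */
static int toy_region_node(int desc_node, int use_uevent_hook)
{
	struct toy_dev bus = { NULL, TOY_NO_NODE, TOY_NO_NODE, TOY_NO_NODE };
	struct toy_dev region = { &bus, desc_node, desc_node, TOY_NO_NODE };

	toy_device_add(&region, use_uevent_hook);
	return region.node_at_probe;
}
```

Calling `toy_region_node(1, 0)` models setting the node at 'create' time only: the value is lost to the parent's -1 before probe. Calling `toy_region_node(1, 1)` models the uevent approach, where the node is back in place before probe runs.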

Sorry I had some interrupt. Yes, this works fine for region &
namespace. I'd like to check with you for btt since the attach logic
has changed in v2.

Previously, as described in patch 16/17, bttN bound to pmem had a valid
numa_node value, and seeding btt0 had -1.

  /sys/bus/nd/devices
  |-- btt0/numa_node:-1
  |-- btt1/numa_node:0

In this version, there are unbound (seeding?) btt0-3 for every region
(there are 4 regions) and btt4 & 5 bound to pmem0 & 3 on my system.

  btt0/numa_node:0
  btt1/numa_node:0
  btt2/numa_node:1
  btt3/numa_node:1
  btt4/numa_node:0
  btt5/numa_node:1

  btt0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0
  btt1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt1
  btt2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2
  btt3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt3
  btt4 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4
  btt5 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5

And unbound bttNs attach to different regions across a reboot.

  btt0/numa_node:0
  btt1/numa_node:1
  btt2/numa_node:1
  btt3/numa_node:0
  btt4/numa_node:0
  btt5/numa_node:1

  btt0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0
  btt1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt1
  btt2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2
  btt3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt3
  btt4 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4
  btt5 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5

Is this how you'd expect btt to work in this version? (I have not
looked at the btt changes yet)

Thanks,
-Toshi
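
For comparing listings like the ones above across reboots, a small user-space helper may be handy. The following is a minimal sketch, not part of the patch: `region_of()` and `same_region()` are hypothetical helpers, and the only assumption is that the symlink target contains a `regionN` path component, as in the output shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the "regionN" path component of a sysfs symlink target into out.
 * Returns 0 on success, -1 if no such component exists or out is too small.
 */
static int region_of(const char *target, char *out, size_t outlen)
{
	const char *p;
	size_t n;

	assert(target != NULL && out != NULL && outlen > 0);

	p = strstr(target, "/region");
	if (!p)
		return -1;

	p++;                 /* skip the leading '/' */
	n = strcspn(p, "/"); /* length up to the next '/' or end of string */
	if (n >= outlen)
		return -1;

	memcpy(out, p, n);
	out[n] = '\0';
	return 0;
}

/* Return 1 if both symlink targets name the same region, else 0. */
static int same_region(const char *a, const char *b)
{
	char ra[32], rb[32];

	if (region_of(a, ra, sizeof(ra)) || region_of(b, rb, sizeof(rb)))
		return 0;
	return strcmp(ra, rb) == 0;
}
```

Feeding the two btt1 targets from the listings above into `same_region()` returns 0, which matches the observation that btt1 moved from region1 to region3 across the reboot.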