Re: [RFC PATCH 1/4] libnvdimm/namespace: Make namespace size validation arch dependent
From: Aneesh Kumar K.V <hidden>
Date: 2019-10-29 04:38:43
Also in:
nvdimm
On 10/29/19 4:38 AM, Dan Williams wrote:
On Mon, Oct 28, 2019 at 2:48 AM Aneesh Kumar K.V [off-list ref] wrote:quoted
The page size used to map the namespace is arch dependent. For example architectures like ppc64 use 16MB page size for direct-mapping. If the namespace size is not aligned to the mapping page size, we can observe kernel crash during namespace init and destroy. This is due to kernel doing partial map/unmap of the resource range BUG: Unable to handle kernel data access at 0xc001000406000000 Faulting instruction address: 0xc000000000090790 NIP [c000000000090790] arch_add_memory+0xc0/0x130 LR [c000000000090744] arch_add_memory+0x74/0x130 Call Trace: arch_add_memory+0x74/0x130 (unreliable) memremap_pages+0x74c/0xa30 devm_memremap_pages+0x3c/0xa0 pmem_attach_disk+0x188/0x770 nvdimm_bus_probe+0xd8/0x470 really_probe+0x148/0x570 driver_probe_device+0x19c/0x1d0 device_driver_attach+0xcc/0x100 bind_store+0x134/0x1c0 drv_attr_store+0x44/0x60 sysfs_kf_write+0x74/0xc0 kernfs_fop_write+0x1b4/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd0/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x68 Signed-off-by: Aneesh Kumar K.V <redacted> --- arch/arm64/mm/flush.c | 11 +++++++++++ arch/powerpc/lib/pmem.c | 21 +++++++++++++++++++-- arch/x86/mm/pageattr.c | 12 ++++++++++++ include/linux/libnvdimm.h | 1 + 4 files changed, 43 insertions(+), 2 deletions(-)diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c index ac485163a4a7..90c54c600023 100644 --- a/arch/arm64/mm/flush.c +++ b/arch/arm64/mm/flush.c@@ -91,4 +91,15 @@ void arch_invalidate_pmem(void *addr, size_t size) __inval_dcache_area(addr, size); } EXPORT_SYMBOL_GPL(arch_invalidate_pmem); + +unsigned long arch_validate_namespace_size(unsigned int ndr_mappings, unsigned long size) +{ + u32 remainder; + + div_u64_rem(size, PAGE_SIZE * ndr_mappings, &remainder); + if (remainder) + return PAGE_SIZE * ndr_mappings; + return 0; +} +EXPORT_SYMBOL_GPL(arch_validate_namespace_size); #endifdiff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c index 377712e85605..2e661a08dae5 100644 --- a/arch/powerpc/lib/pmem.c +++ b/arch/powerpc/lib/pmem.c@@ -17,14 +17,31 @@ void arch_wb_cache_pmem(void *addr, size_t size) unsigned long start = (unsigned long) addr; flush_dcache_range(start, start + size); } -EXPORT_SYMBOL(arch_wb_cache_pmem); +EXPORT_SYMBOL_GPL(arch_wb_cache_pmem); void arch_invalidate_pmem(void *addr, size_t size) { unsigned long start = (unsigned long) addr; flush_dcache_range(start, start + size); } -EXPORT_SYMBOL(arch_invalidate_pmem); +EXPORT_SYMBOL_GPL(arch_invalidate_pmem); + +unsigned long arch_validate_namespace_size(unsigned int ndr_mappings, unsigned long size) +{ + u32 remainder; + unsigned long linear_map_size; + + if (radix_enabled()) + linear_map_size = PAGE_SIZE; + else + linear_map_size = (1UL << mmu_psize_defs[mmu_linear_psize].shift);This seems more a "supported_alignments" problem, and less a namespace size or PAGE_SIZE problem, because if the starting address is misaligned this size validation can still succeed when it shouldn't.
Isn't supported_alignments an indication of how user want the namespace to be mapped to applications? Ie, with the above restrictions we can still do both 64K and 16M mapping of the namespace to userspace. Also for supported alignment the huge page mapping is further dependent on the THP feature. The restrictions here are mostly w.r.t the direct-mapping page size used by some architecture.
One problem is that __size_store() does not validate the size against the namespace alignment. However, the next problem is that alignment is a property of the pfn device, but not the raw namespace. I think this alignment constraint should be captured by exposing "align" and "supported_alignments" at the namespace level as the minimum alignment. The pfn level alignment could then be an additional alignment constraint, but ndctl would likely set them to the same value. The "* ndr_mappings" constraint should be left out of the interface, because that's a side effect of supporting namespace-type-blk aliasing which no platform seems to do in practice. If for some strange reason it's need in the future I'd rather expose the "aliasing" property rather than fold it into the align / supported_aligns interface.
-aneesh