RE: [PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA
From: Song Bao Hua (Barry Song) <hidden>
Date: 2020-07-28 12:19:18
Also in:
linux-iommu, lkml
-----Original Message----- From: Christoph Hellwig [mailto:hch@lst.de] Sent: Tuesday, July 28, 2020 11:53 PM To: Song Bao Hua (Barry Song) <redacted> Cc: hch@lst.de; m.szyprowski@samsung.com; robin.murphy@arm.com; will@kernel.org; ganapatrao.kulkarni@cavium.com; catalin.marinas@arm.com; iommu@lists.linux-foundation.org; Linuxarm [off-list ref]; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Zengtao (B) [off-list ref]; huangdaode [off-list ref]; Jonathan Cameron [off-list ref]; Nicolas Saenz Julienne [off-list ref]; Steve Capper [off-list ref]; Andrew Morton [off-list ref]; Mike Rapoport [off-list ref] Subject: Re: [PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA On Fri, Jul 24, 2020 at 01:13:43AM +1200, Barry Song wrote:quoted
+config CMA_PERNUMA_SIZE_MBYTES + int "Size in Mega Bytes for per-numa CMA areas" + depends on NUMA + default 16 if ARM64 + default 0 + help + Defines the size (in MiB) of the per-numa memory area for Contiguous + Memory Allocator. Every numa node will get a separate CMA with this + size. If the size of 0 is selected, per-numa CMA is disabled.I'm still not a fan of the config option. You can just hardcode the value in CONFIG_CMDLINE based on the kernel parameter. Also I wonder
I am sorry I haven't got your point yet. Do you mean something like the below? arch/arm64/Kconfig: config CMDLINE string "Default kernel command string" - default "" + default "pernuma_cma=16M" help Provide a set of default command-line options at build time by entering them here. As a minimum, you should specify the the root device (e.g. root=/dev/nfs). A background of the current code is that Linux distributions can usually use arch/arm64/configs/defconfig directly to build kernel. cmdline can be easily ignored during the generation of Linux distributions.
if a way to expose this in the device tree might be useful, but people more familiar with the device tree and the arm code will have to chime in on that.
Not sure if it is an useful user case as we are using ACPI but not device tree since it is an ARM64 server with NUMA.
quoted
struct cma *dma_contiguous_default_area; +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES]; /* * Default global CMA area size can be defined in kernel's .config.@@ -44,6 +51,8 @@ struct cma *dma_contiguous_default_area; */ static const phys_addr_t size_bytes __initconst = (phys_addr_t)CMA_SIZE_MBYTES * SZ_1M; +static phys_addr_t pernuma_size_bytes __initdata = + (phys_addr_t)CMA_SIZE_PERNUMA_MBYTES * SZ_1M; static phys_addr_t size_cmdline __initdata = -1; static phys_addr_t base_cmdline __initdata; static phys_addr_t limit_cmdline __initdata;@@ -69,6 +78,13 @@ static int __init early_cma(char *p) } early_param("cma", early_cma); +static int __init early_pernuma_cma(char *p) +{ + pernuma_size_bytes = memparse(p, &p); + return 0; +} +early_param("pernuma_cma", early_pernuma_cma); + #ifdef CONFIG_CMA_SIZE_PERCENTAGE static phys_addr_t __init __maybe_unusedcma_early_percent_memory(void)quoted
@@ -96,6 +112,33 @@ static inline __maybe_unused phys_addr_tcma_early_percent_memory(void)quoted
#endif +void __init dma_pernuma_cma_reserve(void) +{ + int nid; + + if (!pernuma_size_bytes) + return; + + for_each_node_state(nid, N_MEMORY) { + int ret; + char name[20]; + + snprintf(name, sizeof(name), "pernuma%d", nid); + ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0, + 0, false, name, + &dma_contiguous_pernuma_area[nid], + nid);This adds a > 80 char line.
Will refine.
quoted
struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_tgfp)quoted
{ + int nid = dev_to_node(dev); + /* CMA can be used only in the context which permits sleeping */ if (!gfpflags_allow_blocking(gfp)) return NULL; if (dev->cma_area) return cma_alloc_aligned(dev->cma_area, size, gfp); - if (size <= PAGE_SIZE || !dma_contiguous_default_area) + if (size <= PAGE_SIZE) return NULL; + + if ((nid != NUMA_NO_NODE) && !(gfp & (GFP_DMA | GFP_DMA32))) {No need for the braces around the nid check.
Will refine.
quoted
+ struct cma *cma = dma_contiguous_pernuma_area[nid]; + struct page *page; + + if (cma) { + page = cma_alloc_aligned(cma, size, gfp); + if (page) + return page; + } + } + return cma_alloc_aligned(dma_contiguous_default_area, size, gfp);This seems to have lost the dma_contiguous_default_area NULL check.
cma_alloc() is doing the check by returning NULL if cma is NULL.
struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
bool no_warn)
{
...
if (!cma || !cma->count)
return NULL;
}
But I agree here the code can check before calling cma_alloc_aligned.
quoted
+ /* if dev has its own cma, free page from there */ + if (dev->cma_area) { + if (cma_release(dev->cma_area, page, PAGE_ALIGN(size) >>PAGE_SHIFT))quoted
+ return;Another overly long line.
Will refine.
quoted
+ } else { + /* + * otherwise, page is from either per-numa cma or default cma + */ + if (cma_release(dma_contiguous_pernuma_area[page_to_nid(page)], + page, PAGE_ALIGN(size) >> PAGE_SHIFT)) + return; + + if (cma_release(dma_contiguous_default_area, page, + PAGE_ALIGN(size) >> PAGE_SHIFT)) + return; + }I'd introduce a count variable for the value of "PAGE_ALIGN(size) >> PAGE_SHIFT" to clean al lthis up a bit.
Good idea.
Also please add a CONFIG_PERCPU_DMA_CMA config variable so that we don't build this code for the vast majority of users that don't need it.
agreed. Thanks Barry _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel