RE: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA
From: Song Bao Hua (Barry Song) <hidden>
Date: 2020-08-21 08:29:43
Also in:
linux-iommu, lkml
-----Original Message-----
From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Randy Dunlap
Sent: Friday, August 21, 2020 2:50 PM
To: Song Bao Hua (Barry Song) <redacted>; hch@lst.de; m.szyprowski@samsung.com; robin.murphy@arm.com; will@kernel.org; ganapatrao.kulkarni@cavium.com; catalin.marinas@arm.com
Cc: iommu@lists.linux-foundation.org; Linuxarm <redacted>; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; huangdaode [off-list ref]; Jonathan Cameron [off-list ref]; Nicolas Saenz Julienne [off-list ref]; Steve Capper [off-list ref]; Andrew Morton [off-list ref]; Mike Rapoport [off-list ref]
Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA

On 8/20/20 7:26 PM, Barry Song wrote:
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ganapatrao Kulkarni <redacted>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Nicolas Saenz Julienne <redacted>
> Cc: Steve Capper <redacted>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <redacted>
> Signed-off-by: Barry Song <redacted>
> ---
>  v6: rebase on top of 5.9-rc1;
>      doc cleanup
>
>  .../admin-guide/kernel-parameters.txt         |   9 ++
>  include/linux/dma-contiguous.h                |   6 ++
>  kernel/dma/Kconfig                            |  10 ++
>  kernel/dma/contiguous.c                       | 100 ++++++++++++++++--
>  4 files changed, 115 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index bdc1f33fd3d1..3f33b89aeab5 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -599,6 +599,15 @@
>  			altogether. For more information, see
>  			include/linux/dma-contiguous.h
>
> +	pernuma_cma=nn[MG]

memparse() allows any one of these suffixes: K, M, G, T, P, E
and nothing in the option parsing function cares what suffix is used...
Hello Randy,
Thanks for your comments. Actually, I am following the suffix convention of the default cma parameter:

	cma=nn[MG]@[start[MG][-end[MG]]]
			[ARM,X86,KNL]
			Sets the size of kernel global memory area for
			contiguous memory allocations and optionally the
			placement constraint by the physical address range of
			memory allocations. A value of 0 disables CMA
			altogether. For more information, see
			include/linux/dma-contiguous.h

I suggest that users set the size in either MB or GB, as they do for cma.
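To make Randy's point about suffixes concrete, here is a minimal userspace sketch of memparse()-style suffix handling. It is only an illustration: parse_mem_size() is a hypothetical name, it uses strtoull() from the C library rather than any kernel helper, and it is not the kernel's implementation. It shows why the parser accepts K through E even when the documentation advertises only [MG].

```c
#include <stdlib.h>

/*
 * Userspace sketch of memparse()-style suffix handling.
 * parse_mem_size() is a hypothetical name, not a kernel API.
 */
unsigned long long parse_mem_size(const char *s, char **end)
{
	char *p;
	unsigned long long val = strtoull(s, &p, 0);

	/* Each case falls through, applying one more factor of 1024. */
	switch (*p) {
	case 'E': case 'e':
		val <<= 10;
		/* fall through */
	case 'P': case 'p':
		val <<= 10;
		/* fall through */
	case 'T': case 't':
		val <<= 10;
		/* fall through */
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		p++;	/* consume the suffix character */
		break;
	default:
		break;	/* no suffix: the value is plain bytes */
	}

	if (end)
		*end = p;
	return val;
}
```

With this sketch, "64M" and "65536K" parse to the same byte count, so restricting the documented suffixes to [MG] is a documentation choice rather than a parser limit.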
> +			[ARM64,KNL]
> +			Sets the size of kernel per-numa memory area for
> +			contiguous memory allocations. A value of 0 disables
> +			per-numa CMA altogether. DMA users on node nid will
> +			first try to allocate buffer from the pernuma area
> +			which is located in node nid, if the allocation fails,
> +			they will fallback to the global default memory area.
> +
>  	cmo_free_hint=	[PPC] Format: { yes | no }
>  			Specify whether pages are marked as being
>  			inactive when they are freed.  This is used in
>  			CMO environments
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index cff7e60968b9..89b95f10e56d 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -69,6 +69,19 @@ static int __init early_cma(char *p)
>  }
>  early_param("cma", early_cma);
>
> +#ifdef CONFIG_DMA_PERNUMA_CMA
> +
> +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
> +static phys_addr_t pernuma_size_bytes __initdata;

why phys_addr_t? couldn't it just be unsigned long long?
Mainly to follow the existing convention in kernel/dma/contiguous.c.
I think the original code probably meant that the size should not be larger than the
maximum value of phys_addr_t:
/*
 * Default global CMA area size can be defined in kernel's .config.
 * This is useful mainly for distro maintainers to create a kernel
 * that works correctly for most supported systems.
 * The size can be set in bytes or as a percentage of the total memory
 * in the system.
 *
 * Users, who want to set the size of global CMA area for their system
 * should use cma= kernel parameter.
 */
static const phys_addr_t size_bytes __initconst =
	(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
static phys_addr_t size_cmdline __initdata = -1;
static phys_addr_t base_cmdline __initdata;
static phys_addr_t limit_cmdline __initdata;

void __init dma_contiguous_reserve(phys_addr_t limit)
{
	phys_addr_t selected_size = 0;
	phys_addr_t selected_base = 0;
	phys_addr_t selected_limit = limit;
	bool fixed = false;

	pr_debug("%s(limit %08lx)\n", __func__, (unsigned long)limit);

	if (size_cmdline != -1) {
		selected_size = size_cmdline;
		selected_base = base_cmdline;
		selected_limit = min_not_zero(limit_cmdline, limit);
		if (base_cmdline + size_cmdline == limit_cmdline)
			fixed = true;
Since the whole file uses phys_addr_t for sizes, I don't want the new code to look out of place.
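As a rough userspace illustration of how the excerpt above treats phys_addr_t values, the sketch below mirrors only the command-line branch that is shown. Assumptions: phys_addr_t is modeled as a 64-bit integer, and SIZE_UNSET, struct cma_choice, min_nonzero() and choose_from_cmdline() are hypothetical names invented for this example. The branch that is cut off in the excerpt is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: 64-bit physical addresses */

#define SIZE_UNSET ((phys_addr_t)-1)	/* hypothetical name for the -1 sentinel */

struct cma_choice {	/* hypothetical helper struct for this sketch */
	phys_addr_t size, base, limit;
	bool fixed;
};

/* Smaller of two values, where a zero operand means "no limit given". */
static phys_addr_t min_nonzero(phys_addr_t a, phys_addr_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

struct cma_choice choose_from_cmdline(phys_addr_t size_cmdline,
				      phys_addr_t base_cmdline,
				      phys_addr_t limit_cmdline,
				      phys_addr_t limit)
{
	struct cma_choice c = { 0, 0, limit, false };

	if (size_cmdline != SIZE_UNSET) {
		c.size = size_cmdline;
		c.base = base_cmdline;
		c.limit = min_nonzero(limit_cmdline, limit);
		/* base + size landing exactly on the limit pins the placement */
		if (base_cmdline + size_cmdline == limit_cmdline)
			c.fixed = true;
	}
	/* The other branch is not shown in the excerpt, so it is omitted here. */
	return c;
}
```

The size is added to a base address and compared against a limit, which is one plausible reason the file keeps sizes in the same type as addresses.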
OK, so cma_declare_contiguous_nid() uses phys_addr_t. Fine.

> +
> +static int __init early_pernuma_cma(char *p)
> +{
> +	pernuma_size_bytes = memparse(p, &p);
> +	return 0;
> +}
> +early_param("pernuma_cma", early_pernuma_cma);
> +#endif
> +
>  #ifdef CONFIG_CMA_SIZE_PERCENTAGE
>
>  static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
> @@ -96,6 +109,34 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
>
>  #endif
>
> +#ifdef CONFIG_DMA_PERNUMA_CMA
> +void __init dma_pernuma_cma_reserve(void)
> +{
> +	int nid;
> +
> +	if (!pernuma_size_bytes)
> +		return;
> +
> +	for_each_node_state(nid, N_ONLINE) {
> +		int ret;
> +		char name[20];
> +		struct cma **cma = &dma_contiguous_pernuma_area[nid];
> +
> +		snprintf(name, sizeof(name), "pernuma%d", nid);
> +		ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
> +						 0, false, name, cma, nid);
> +		if (ret) {
> +			pr_warn("%s: reservation failed: err %d, node %d", __func__,
> +				ret, nid);
> +			continue;
> +		}
> +
> +		pr_debug("%s: reserved %llu MiB on node %d\n", __func__,
> +			(unsigned long long)pernuma_size_bytes / SZ_1M, nid);

Conversely, if you want to leave pernuma_size_bytes as phys_addr_t,
you should use %pa (or %pap) to print it.
Here I think it is being used as an integer "size".
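The cast-and-divide form used in the pr_debug() call can be checked in userspace with a small sketch. Assumptions: phys_addr_t is modeled as uint64_t, SZ_1M is redefined locally, and format_reserved() is a hypothetical helper that exists only for this example. The kernel-only %pa specifier, which as far as I know takes a pointer to the phys_addr_t and prints it as a hex address, has no userspace equivalent and is not used here.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t phys_addr_t;	/* assumption: stand-in for the kernel type */
#define SZ_1M 0x00100000ULL	/* local definition for the sketch */

/* Hypothetical helper: build the message the way the patch's pr_debug() does. */
int format_reserved(char *buf, size_t len, phys_addr_t size_bytes, int nid)
{
	/*
	 * The cast binds tighter than '/', so an unsigned long long is
	 * divided by SZ_1M and the result matches the %llu specifier
	 * regardless of how wide phys_addr_t is on the build.
	 */
	return snprintf(buf, len, "reserved %llu MiB on node %d",
			(unsigned long long)size_bytes / SZ_1M, nid);
}
```

Printing a MiB count this way reads as a decimal size, whereas an address-style specifier would show the raw byte value in hex, which seems to be the distinction being drawn in the reply.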
> +	}
> +}
> +#endif
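The allocation order described in the documentation hunk (node-local area first, then the global default area) can be sketched in userspace as follows. This is only an illustration of the fallback order under stated assumptions: struct area is a hypothetical stand-in for struct cma, MAX_NODES is an arbitrary small constant, and area_take() and alloc_with_fallback() are invented names, not functions from the patch.

```c
#include <stddef.h>

#define MAX_NODES 4	/* assumption: small fixed node count for the sketch */

struct area {	/* hypothetical stand-in for struct cma */
	size_t free_pages;
};

/* A NULL entry models a node whose reservation failed or was never made. */
struct area *pernuma_area[MAX_NODES];
struct area default_area = { 1024 };

/* Take count pages from an area; returns 1 on success, 0 on failure. */
static int area_take(struct area *a, size_t count)
{
	if (!a || a->free_pages < count)
		return 0;
	a->free_pages -= count;
	return 1;
}

/*
 * Try the node-local area first, then fall back to the global default.
 * Returns the area that satisfied the request, or NULL if both failed.
 */
struct area *alloc_with_fallback(int nid, size_t count)
{
	if (nid >= 0 && nid < MAX_NODES && area_take(pernuma_area[nid], count))
		return pernuma_area[nid];
	if (area_take(&default_area, count))
		return &default_area;
	return NULL;
}
```

A node without a per-numa area simply skips the first step, which matches the patch's behaviour of continuing past nodes where the reservation fails.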
Thanks
Barry