Thread: 14 messages, 4 authors, 2020-08-31

RE: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA

From: Song Bao Hua (Barry Song) <hidden>
Date: 2020-08-21 08:29:43
Also in: linux-iommu, lkml

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Randy Dunlap
Sent: Friday, August 21, 2020 2:50 PM
To: Song Bao Hua (Barry Song) <redacted>; hch@lst.de;
m.szyprowski@samsung.com; robin.murphy@arm.com; will@kernel.org;
ganapatrao.kulkarni@cavium.com; catalin.marinas@arm.com
Cc: iommu@lists.linux-foundation.org; Linuxarm <redacted>;
linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org;
huangdaode [off-list ref]; Jonathan Cameron
[off-list ref]; Nicolas Saenz Julienne
[off-list ref]; Steve Capper [off-list ref]; Andrew
Morton [off-list ref]; Mike Rapoport [off-list ref]
Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve
per-numa CMA

On 8/20/20 7:26 PM, Barry Song wrote:

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ganapatrao Kulkarni <redacted>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nicolas Saenz Julienne <redacted>
Cc: Steve Capper <redacted>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <redacted>
Signed-off-by: Barry Song <redacted>
---
 v6: rebase on top of 5.9-rc1;
     doc cleanup

 .../admin-guide/kernel-parameters.txt         |   9 ++
 include/linux/dma-contiguous.h                |   6 ++
 kernel/dma/Kconfig                            |  10 ++
 kernel/dma/contiguous.c                       | 100 ++++++++++++++++--
 4 files changed, 115 insertions(+), 10 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdc1f33fd3d1..3f33b89aeab5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -599,6 +599,15 @@
 			altogether. For more information, see
 			include/linux/dma-contiguous.h

+	pernuma_cma=nn[MG]
memparse() allows any one of these suffixes: K, M, G, T, P, E
and nothing in the option parsing function cares what suffix is used...
Hello Randy,
Thanks for your comments.

Actually, I am following the suffix convention documented for the default cma parameter:
	cma=nn[MG]@[start[MG][-end[MG]]]
			[ARM,X86,KNL]
			Sets the size of kernel global memory area for
			contiguous memory allocations and optionally the
			placement constraint by the physical address range of
			memory allocations. A value of 0 disables CMA
			altogether. For more information, see
			include/linux/dma-contiguous.h

I suggest that users set the size in either MB or GB, just as they do for cma.
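
For readers unfamiliar with the suffix handling Randy refers to, the following is a minimal user-space sketch of how a memparse()-style parser scales a number by a binary suffix. It is an approximation written for illustration, not the kernel source: sketch_memparse is a hypothetical name, and strtoull() stands in for the kernel's own number parser.

```c
#include <stdlib.h>

/*
 * User-space approximation of memparse()-style parsing (assumption:
 * hypothetical helper, not a kernel symbol). Parses a number, then
 * multiplies by 1024 once per suffix step: K, M, G, T, P, E.
 * If retptr is non-NULL, it receives the first unparsed character.
 */
static unsigned long long sketch_memparse(const char *s, char **retptr)
{
	char *end;
	unsigned long long val = strtoull(s, &end, 0);

	switch (*end) {
	case 'E': case 'e':
		val <<= 10;
		/* fall through */
	case 'P': case 'p':
		val <<= 10;
		/* fall through */
	case 'T': case 't':
		val <<= 10;
		/* fall through */
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		end++;		/* consume the suffix character */
		break;
	default:
		break;		/* no suffix: value is in bytes */
	}

	if (retptr)
		*retptr = end;
	return val;
}
```

Under this sketch, "64M" and "2G" parse as expected, but so would "1T" or a bare byte count, which is the point of the review comment: the [MG] in the documentation is a recommendation, not something the parser enforces.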
+			[ARM64,KNL]
+			Sets the size of kernel per-numa memory area for
+			contiguous memory allocations. A value of 0 disables
+			per-numa CMA altogether. DMA users on node nid will
+			first try to allocate buffer from the pernuma area
+			which is located in node nid, if the allocation fails,
+			they will fallback to the global default memory area.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed.  This is used in CMO environments
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index cff7e60968b9..89b95f10e56d 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -69,6 +69,19 @@ static int __init early_cma(char *p)
 }
 early_param("cma", early_cma);

+#ifdef CONFIG_DMA_PERNUMA_CMA
+
+static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
+static phys_addr_t pernuma_size_bytes __initdata;
why phys_addr_t? couldn't it just be unsigned long long?
Mainly to follow the existing convention in kernel/dma/contiguous.c.
I think the original code probably meant that the size should not be larger than the maximum
value of phys_addr_t:

/*
 * Default global CMA area size can be defined in kernel's .config.
 * This is useful mainly for distro maintainers to create a kernel
 * that works correctly for most supported systems.
 * The size can be set in bytes or as a percentage of the total memory
 * in the system.
 *
 * Users, who want to set the size of global CMA area for their system
 * should use cma= kernel parameter.
 */
static const phys_addr_t size_bytes __initconst =
	(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
static phys_addr_t  size_cmdline __initdata = -1;
static phys_addr_t base_cmdline __initdata;
static phys_addr_t limit_cmdline __initdata;

void __init dma_contiguous_reserve(phys_addr_t limit)
{
	phys_addr_t selected_size = 0;
	phys_addr_t selected_base = 0;
	phys_addr_t selected_limit = limit;
	bool fixed = false;

	pr_debug("%s(limit %08lx)\n", __func__, (unsigned long)limit);

	if (size_cmdline != -1) {
		selected_size = size_cmdline;
		selected_base = base_cmdline;
		selected_limit = min_not_zero(limit_cmdline, limit);
		if (base_cmdline + size_cmdline == limit_cmdline)
			fixed = true;

If the whole file uses phys_addr_t for sizes, I don't want the new code to look out of place.
OK, so cma_declare_contiguous_nid() uses phys_addr_t. Fine.
+
+static int __init early_pernuma_cma(char *p)
+{
+	pernuma_size_bytes = memparse(p, &p);
+	return 0;
+}
+early_param("pernuma_cma", early_pernuma_cma);
+#endif
+
 #ifdef CONFIG_CMA_SIZE_PERCENTAGE

 static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
@@ -96,6 +109,34 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
 #endif

+#ifdef CONFIG_DMA_PERNUMA_CMA
+void __init dma_pernuma_cma_reserve(void)
+{
+	int nid;
+
+	if (!pernuma_size_bytes)
+		return;
+
+	for_each_node_state(nid, N_ONLINE) {
+		int ret;
+		char name[20];
+		struct cma **cma = &dma_contiguous_pernuma_area[nid];
+
+		snprintf(name, sizeof(name), "pernuma%d", nid);
+		ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
+						 0, false, name, cma, nid);
+		if (ret) {
+			pr_warn("%s: reservation failed: err %d, node %d", __func__,
+				ret, nid);
+			continue;
+		}
+
+		pr_debug("%s: reserved %llu MiB on node %d\n", __func__,
+			(unsigned long long)pernuma_size_bytes / SZ_1M, nid);
Conversely, if you want to leave pernuma_size_bytes as phys_addr_t,
you should use %pa (or %pap) to print it.
Here I think it is being used as an integer size rather than as an address.
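
To make the two options concrete: the patch casts the size to unsigned long long and prints it with %llu, while the alternative Randy mentions is the kernel's %pa printk extension, which takes a pointer to a phys_addr_t and prints the value in hexadecimal. The sketch below illustrates only the cast approach, in user space, because %pa does not exist in the C library's printf. The names sketch_phys_addr_t, SKETCH_SZ_1M and format_reserved are hypothetical stand-ins, and the 32-bit typedef is an assumption chosen to show why the cast is needed when the type's width depends on the kernel configuration.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for phys_addr_t (assumption: 32-bit here; the kernel type
 * may be 32 or 64 bits wide depending on configuration). */
typedef uint32_t sketch_phys_addr_t;

/* Stand-in for the kernel's SZ_1M constant. */
#define SKETCH_SZ_1M (1024ULL * 1024ULL)

/*
 * Format a reservation message the way the patch's pr_debug() does:
 * widen the size to unsigned long long so that %llu is correct no
 * matter how wide the underlying type is, then convert to MiB.
 */
static int format_reserved(char *buf, size_t len,
			   sketch_phys_addr_t size, int nid)
{
	return snprintf(buf, len, "reserved %llu MiB on node %d",
			(unsigned long long)size / SKETCH_SZ_1M, nid);
}
```

Printing a decimal MiB count suits a value that is conceptually a size, which matches Barry's reasoning; %pa would instead show the raw byte count in hex.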
+	}
+}
+#endif
Thanks
Barry

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel