Thread (98 messages) 98 messages, 6 authors, 2023-12-18

Re: [PATCH RFC v2 11/27] arm64: mte: Reserve tag storage memory

From: Hyesoo Yu <hidden>
Date: 2023-12-08 05:15:33
Also in: kvmarm, linux-arch, linux-fsdevel, linux-mm, linux-trace-kernel, lkml

Hi, 

I'm sorry for the late response, I was on vacation.

On Sun, Dec 03, 2023 at 12:14:30PM +0000, Alexandru Elisei wrote:
Hi,

On Wed, Nov 29, 2023 at 05:44:24PM +0900, Hyesoo Yu wrote:
quoted
Hello.

On Sun, Nov 19, 2023 at 04:57:05PM +0000, Alexandru Elisei wrote:
quoted
Allow the kernel to get the size and location of the MTE tag storage
regions from the DTB. This memory is marked as reserved for now.

The DTB node for the tag storage region is defined as:

        tags0: tag-storage@8f8000000 {
                compatible = "arm,mte-tag-storage";
                reg = <0x08 0xf8000000 0x00 0x4000000>;
                block-size = <0x1000>;
                memory = <&memory0>;	// Associated tagged memory node
        };
How about using compatible = "shared-dma-pool" like below ?

&reserved_memory {
	tags0: tag0@8f8000000 {
		compatible = "arm,mte-tag-storage";
        	reg = <0x08 0xf8000000 0x00 0x4000000>;
	};
}

tag-storage {
        compatible = "arm,mte-tag-storage";
	memory-region = <&tag>;
        memory = <&memory0>;
	block-size = <0x1000>;
}

And then, the activation of CMA would be performed in the CMA code.
We just can get the region information from memory-region and allocate it directly
like alloc_contig_range, take_page_off_buddy. It seems like we can remove a lots of code.
Sorry, that example was my mistake. Actually I wanted to write like this. 

&reserved_memory {
	tags0: tag0@8f8000000 {
		compatible = "shared-dma-pool";
        	reg = <0x08 0xf8000000 0x00 0x4000000>;
		reusable;
	};
}

tag-storage {
        compatible = "arm,mte-tag-storage";
 	memory-region = <&tag>;
        memory = <&memory0>;
	block-size = <0x1000>;
}

Played with reserved_mem a bit. I don't think that's the correct path
forward.

The location of the tag storage is a hardware property, independent of how
Linux is configured.

early_init_fdt_scan_reserved_mem() is called from arm64_memblock_init(),
**after** the kernel enforces an upper address for various reasons. One of
the reasons can be that it's been compiled with 39 bits VA.
I'm not sure about this part. What is the upper address enforced by the kernel ?
Where can I check the code ? Do you means that memblock_end_of_DRAM() ?  
quoted hunk ↗ jump to hunk
After early_init_fdt_scan_reserved_mem() returns, the kernel sets the
maximum address, stored in the variable "high_memory".

What can happen is that tag storage is present at an address above the
maximum addressable by the kernel, and the CMA code will trigger an
unrecovrable page fault.

I was able to trigger this with the dts change:
diff --git a/arch/arm64/boot/dts/arm/fvp-base-revc.dts b/arch/arm64/boot/dts/arm/fvp-base-revc.dts
index 60472d65a355..201359d014e4 100644
--- a/arch/arm64/boot/dts/arm/fvp-base-revc.dts
+++ b/arch/arm64/boot/dts/arm/fvp-base-revc.dts
@@ -183,6 +183,13 @@ vram: vram@18000000 {
                        reg = <0x00000000 0x18000000 0 0x00800000>;
                        no-map;
                };
+
+
+               linux,cma {
+                       compatible = "shared-dma-pool";
+                       reg = <0x100 0x0 0x00 0x4000000>;
+                       reusable;
+               };
        };

        gic: interrupt-controller@2f000000 {
And the error I got:

[    0.000000] Reserved memory: created CMA memory pool at 0x0000010000000000, size 64 MiB
[    0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[    0.000000] OF: reserved mem: 0x0000010000000000..0x0000010003ffffff (65536 KiB) map reusable linux,cma
[..]
[    0.793193] WARNING: CPU: 0 PID: 1 at mm/cma.c:111 cma_init_reserved_areas+0xa8/0x378
[..]
[    0.806945] Unable to handle kernel paging request at virtual address 00000001fe000000
[    0.807277] Mem abort info:
[    0.807277]   ESR = 0x0000000096000005
[    0.807693]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.808110]   SET = 0, FnV = 0
[    0.808443]   EA = 0, S1PTW = 0
[    0.808526]   FSC = 0x05: level 1 translation fault
[    0.808943] Data abort info:
[    0.808943]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[    0.809360]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.809776]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.810221] [00000001fe000000] user address but active_mm is swapper
[..]
[    0.820887] Call trace:
[    0.821027]  cma_init_reserved_areas+0xc4/0x378
[    0.821443]  do_one_initcall+0x7c/0x1c0
[    0.821860]  kernel_init_freeable+0x1bc/0x284
[    0.822277]  kernel_init+0x24/0x1dc
[    0.822693]  ret_from_fork+0x10/0x20
[    0.823554] Code: 9127a29a cb813321 d37ae421 8b030020 (f8636822)
[    0.823554] ---[ end trace 0000000000000000 ]---
[    0.824360] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.824443] SMP: stopping secondary CPUs
[    0.825193] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Should reserved mem check if the reserved memory is actually addressable by
the kernel if it's not "no-map"? Should cma fail gracefully if
!pfn_valid(base_pfn)? Shold early_init_fdt_scan_reserved_mem() be moved
because arm64_bootmem_init()? I don't have the answer to any of those. And
I got a kernel panic because the kernel cannot address that memory (39 bits
VA). I don't know what would happen if the upper limit is reduced for
another reason.
My answer may not be accurate because I don't understand what this upper limit is.
Is this a problem caused by the tag storage area not being included in the memory node ?

The reason for not including it in the memory node is to enable static mte when dmte
initilization fails, right ? I think I missed that. I thought the tag storage is included
in the memory node and registered as cma.
What I think should happen:

1. Add the tag storage memory before any limits are enforced by
arm64_bootmem_init().

2. Call cma_declare_contiguous_nid() after arm64_bootmem_init(), because
the function will check the memory limit.

3. Have an arch initcall that checks that the CMA regions corresponding to
the tag storage have been activated successfully (cma_init_reserved_areas()
is a core initcall). If not, then don't enable tag storage.

How does that sound to you?

Thanks,
Alex
I think this is a good way to utilize the cma code !

Thanks,
Regards.
quoted
quoted
+	ret = tag_storage_of_flat_read_u32(node, "block-size", &block_size_bytes);
+	if (ret || block_size_bytes == 0) {
+		pr_err("Invalid or missing 'block-size' property");
+		return -EINVAL;
+	}
+	region->block_size = get_block_size_pages(block_size_bytes);
+	if (range_len(tag_range) % region->block_size != 0) {
+		pr_err("Tag storage region size 0x%llx is not a multiple of block size %u",
+		       PFN_PHYS(range_len(tag_range)), region->block_size);
+		return -EINVAL;
+	}
+
I was confused about the variable "block_size", The block size declared in the device tree is
in bytes, but the actual block size used is in pages. I think the term "block_size" can cause
confusion as it might be interpreted as bytes. If possible, I suggest changing the term "block_size"
to something more readable, such as "block_nr_pages" (This is just a example!)

Thanks,
Regards.
What do you think about this ?

Thanks,
Regards.
quoted
quoted
+	ret = tag_storage_of_flat_read_u32(mem_node, "numa-node-id", &nid);
+	if (ret)
+		nid = numa_node_id();
+
+	ret = memblock_add_node(PFN_PHYS(tag_range->start), PFN_PHYS(range_len(tag_range)),
+				nid, MEMBLOCK_NONE);
+	if (ret) {
+		pr_err("Error adding tag memblock (%d)", ret);
+		return ret;
+	}
+	memblock_reserve(PFN_PHYS(tag_range->start), PFN_PHYS(range_len(tag_range)));
+
+	pr_info("Found tag storage region 0x%llx-0x%llx, block size %u pages",
+		PFN_PHYS(tag_range->start), PFN_PHYS(tag_range->end), region->block_size);
+
+	num_tag_regions++;
+
+	return 0;
+}
+
+void __init mte_tag_storage_init(void)
+{
+	struct range *tag_range;
+	int i, ret;
+
+	ret = of_scan_flat_dt(fdt_init_tag_storage, NULL);
+	if (ret) {
+		for (i = 0; i < num_tag_regions; i++) {
+			tag_range = &tag_regions[i].tag_range;
+			memblock_remove(PFN_PHYS(tag_range->start), PFN_PHYS(range_len(tag_range)));
+		}
+		num_tag_regions = 0;
+		pr_info("MTE tag storage region management disabled");
+	}
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 417a8a86b2db..1b77138c1aa5 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -42,6 +42,7 @@
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
 #include <asm/kasan.h>
+#include <asm/mte_tag_storage.h>
 #include <asm/numa.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
@@ -342,6 +343,12 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 			   FW_BUG "Booted with MMU enabled!");
 	}
 
+	/*
+	 * Must be called before memory limits are enforced by
+	 * arm64_memblock_init().
+	 */
+	mte_tag_storage_init();
+
 	arm64_memblock_init();
 
 	paging_init();
-- 
2.42.1

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help