Thread: 34 messages, 3 authors, 2024-02-28

Re: KASAN debug kernel fails to boot at early stage when CONFIG_SMP=y is set (kernel 6.5-rc5, PowerMac G4 3,6)

From: Christophe Leroy <hidden>
Date: 2023-09-14 04:55:36


On 12/09/2023 at 19:39, Christophe Leroy wrote:

On 12/09/2023 at 17:59, Erhard Furtner wrote:
printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:169 0 30000000 1400000 2000000
__mmu_mapin_ram:146 0 1400000
__mmu_mapin_ram:155 1400000
__mmu_mapin_ram:146 1400000 30000000
__mmu_mapin_ram:155 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:129
kasan_mmu_init:132 0
kasan_mmu_init:137
ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
Linux version 6.6.0-rc1-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #5 SMP Tue Sep 12 16:50:47 CEST 2023
kasan_init_region: c0000000 30000000 f8000000 fe000000
kasan_init_region: loop f8000000 fe000000


So I get no "kasan_init_region: setbat" line and don't reach "KASAN init done".
Ah OK, maybe your CPU only has 4 BATs and they are all in use; the
following change would tell us:
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 850783cfa9c7..bd26767edce7 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -86,6 +86,7 @@ int __init find_free_bat(void)
   		if (!(bat[1].batu & 3))
   			return b;
   	}
+	pr_err("NO FREE BAT (%d)\n", n);
   	return -1;
   }

Or you have 8 BATs, in which case it's an alignment problem: you need to
increase CONFIG_DATA_SHIFT to 23, and for that you need
CONFIG_ADVANCED_OPTIONS and CONFIG_DATA_SHIFT_BOOL.

But regardless of that, there is a problem we need to track down, because
it should work without BATs.

As the BAT allocation fails, it falls back to:

		phys = memblock_phys_alloc_range(k_end - k_start, PAGE_SIZE, 0,
						 MEMBLOCK_ALLOC_ANYWHERE);
		if (!phys)
			return -ENOMEM;
	}

	ret = kasan_init_shadow_page_tables(k_start, k_end);
	if (ret)
		return ret;

	for (k_cur = k_start; k_cur < k_end; k_cur += PAGE_SIZE) {
		pmd_t *pmd = pmd_off_k(k_cur);
		pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_start), PAGE_KERNEL);

		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
	}
	flush_tlb_kernel_range(k_start, k_end);
	memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
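
For reference, the k_start and k_end values printed by kasan_init_region in the log above follow from the generic KASAN address-to-shadow mapping. What follows is a minimal user-space sketch, not the kernel code. It assumes the usual scale shift of 3 and a shadow offset of 0xe0000000 (inferred from the logged values, not stated in this thread). mem_to_shadow() and shadow_size() are illustrative stand-ins, the former for kasan_mem_to_shadow().

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: one shadow byte per 8 bytes of memory, and a shadow
 * offset of 0xe0000000 as inferred from the log in this thread. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xe0000000u

/* Illustrative stand-in for kasan_mem_to_shadow() on a 32-bit kernel */
static uint32_t mem_to_shadow(uint32_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Size of the shadow area covering [start, start + size) */
static uint32_t shadow_size(uint32_t start, uint32_t size)
{
	assert(size != 0);
	return mem_to_shadow(start + size) - mem_to_shadow(start);
}
```

With start = 0xc0000000 and size = 0x30000000 this gives k_start = 0xf8000000 and k_end = 0xfe000000, that is 96MB of shadow memory, matching the "kasan_init_region: c0000000 30000000 f8000000 fe000000" line.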


Whereas the __weak function that you confirmed works is:

	ret = kasan_init_shadow_page_tables(k_start, k_end);
	if (ret)
		return ret;

	block = memblock_alloc(k_end - k_start, PAGE_SIZE);
	if (!block)
		return -ENOMEM;

	for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
		pmd_t *pmd = pmd_off_k(k_cur);
		void *va = block + k_cur - k_start;
		pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);

		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
	}
	flush_tlb_kernel_range(k_start, k_end);


I'm having a hard time understanding what could be wrong in the first place.

Could you try the following change:
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 9954b7a3b7ae..e04f21908c6a 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -38,7 +38,7 @@ int __init kasan_init_region(void *start, size_t size)

   	if (k_nobat < k_end) {
   		phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
-						 MEMBLOCK_ALLOC_ANYWHERE);
+						 MEMBLOCK_ALLOC_ACCESSIBLE);
   		if (!phys)
   			return -ENOMEM;
   	}
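
The intent of this first change can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation. effective_end() is a hypothetical helper that mimics how memblock bounds its search window. The two constants are assumed to mirror the definitions in include/linux/memblock.h. The memblock current limit is assumed to equal the end of lowmem (0x30000000 in the log), and the RAM top is the 2048MB reported in the log. A 64-bit phys_addr_t is used here for simplicity.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* model only; may be 32-bit on the target */

/* Assumed to mirror include/linux/memblock.h */
#define MEMBLOCK_ALLOC_ANYWHERE   (~(phys_addr_t)0)
#define MEMBLOCK_ALLOC_ACCESSIBLE 0

/* Hypothetical helper: upper bound of the range an allocation may come from */
static phys_addr_t effective_end(phys_addr_t end, phys_addr_t current_limit,
				 phys_addr_t ram_top)
{
	assert(current_limit <= ram_top);

	/* ACCESSIBLE means "stay below the current (already mapped) limit" */
	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
		end = current_limit;

	/* An allocation can never come from beyond the end of RAM */
	return end < ram_top ? end : ram_top;
}
```

With the values from the log, MEMBLOCK_ALLOC_ANYWHERE allows the shadow memory to come from anywhere below 0x80000000, so an allocation made from the top of memory may land above the 768MB of lowmem, whereas MEMBLOCK_ALLOC_ACCESSIBLE caps it at 0x30000000.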
And also this one:

diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c
index a70828a6d935..bc1c075489f4 100644
--- a/arch/powerpc/mm/kasan/init_32.c
+++ b/arch/powerpc/mm/kasan/init_32.c
@@ -84,6 +84,9 @@ kasan_update_early_region(unsigned long k_start, unsigned long k_end, pte_t pte)
   {
   	unsigned long k_cur;

+	if (k_start == k_end)
+		return;
+
   	for (k_cur = k_start; k_cur != k_end; k_cur += PAGE_SIZE) {
   		pmd_t *pmd = pmd_off_k(k_cur);
   		pte_t *ptep = pte_offset_kernel(pmd, k_cur);

I tested the two vmlinux images you sent me off-list; they both boot
without problems on QEMU.

Regarding the use of BATs: in fact a shift of 23 is still not enough to
get a free BAT for KASAN. But at least it allows you to map all linear
memory with BATs, whereas a shift of 22 would require 9 BATs:

With shift 22 you have BATs of size (in MB): 4+4+8+16+32+64+128+256+256
With shift 23 you have BATs of size (in MB): 8+8+16+32+64+128+256+256
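
The two lists above can be reproduced with a small user-space model of the BAT decomposition. This is a simplified sketch, not the kernel code. It assumes each BAT is a power-of-two block of at most 256MB, naturally aligned on its base, and that the ranges below and above the text/data border are mapped separately, as the two __mmu_mapin_ram calls in the log suggest. The 20MB border (0x1400000) comes from the log; the 24MB border for shift 23 is an assumption (20MB rounded up to 8MB alignment). The function names are illustrative.

```c
#include <assert.h>
#include <stdio.h>

#define SZ_1M   0x100000UL
#define SZ_256M 0x10000000UL

/* Largest power-of-two block, at most 256MB, that is naturally aligned
 * on base and fits entirely inside [base, top). */
static unsigned long bat_block_size(unsigned long base, unsigned long top)
{
	unsigned long size = SZ_256M;

	assert(base < top);
	while (size > 1 && ((base & (size - 1)) || size > top - base))
		size >>= 1;
	return size;
}

/* Number of BATs needed to cover [base, top); prints sizes if verbose */
static int count_bats_range(unsigned long base, unsigned long top, int verbose)
{
	int n = 0;

	while (base < top) {
		unsigned long size = bat_block_size(base, top);

		if (verbose)
			printf("%luMB ", size / SZ_1M);
		base += size;
		n++;
	}
	return n;
}

/* [0, border) and [border, top) are covered independently */
static int count_bats(unsigned long border, unsigned long top, int verbose)
{
	int n = count_bats_range(0, border, verbose) +
		count_bats_range(border, top, verbose);

	if (verbose)
		printf("-> %d BATs\n", n);
	return n;
}
```

Calling count_bats(0x1400000, 0x30000000, 1) walks through 16, 4, 4, 8, 32, 64, 128, 256 and 256MB blocks (9 BATs), and count_bats(0x1800000, 0x30000000, 1) through 16, 8, 8, 32, 64, 128, 256 and 256MB blocks (8 BATs). These are the same sets of sizes as listed above, in mapping order. In the shift-22 case the first eight blocks end at 0x20000000, which is where the log shows __mmu_mapin_ram stopping.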

So let's forget that for the moment, although you may try with
CONFIG_STRICT_KERNEL_RWX; in that case you should have enough BATs.

But let's try to refocus on the real problem.

In your last mail you say you tried with all patches. Did that include the
two changes above?

If not, can you run the tests with those two changes in addition,
first one at a time, then both together, depending on the result?

Many thanks for your help and perseverance
Christophe