Thread: 34 messages, 3 authors, 2024-02-28

Re: KASAN debug kernel fails to boot at early stage when CONFIG_SMP=y is set (kernel 6.5-rc5, PowerMac G4 3,6)

From: Erhard Furtner <hidden>
Date: 2023-09-14 12:34:56

On Thu, 14 Sep 2023 04:54:17 +0000
Christophe Leroy [off-list ref] wrote:
On 12/09/2023 at 19:39, Christophe Leroy wrote:

On 12/09/2023 at 17:59, Erhard Furtner wrote:
printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:169 0 30000000 1400000 2000000
__mmu_mapin_ram:146 0 1400000
__mmu_mapin_ram:155 1400000
__mmu_mapin_ram:146 1400000 30000000
__mmu_mapin_ram:155 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:129
kasan_mmu_init:132 0
kasan_mmu_init:137
ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
Linux version 6.6.0-rc1-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #5 SMP Tue Sep 12 16:50:47 CEST 2023
kasan_init_region: c0000000 30000000 f8000000 fe000000
kasan_init_region: loop f8000000 fe000000


So I get no "kasan_init_region: setbat" line and don't reach "KASAN init done".  
Ah OK, maybe your CPU only has 4 BATs and they are all used; the
following change would tell us.
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 850783cfa9c7..bd26767edce7 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -86,6 +86,7 @@ int __init find_free_bat(void)
   		if (!(bat[1].batu & 3))
   			return b;
   	}
+	pr_err("NO FREE BAT (%d)\n", n);
   	return -1;
   }

Or you have 8 BATs, in which case it's an alignment problem: you need to
increase CONFIG_DATA_SHIFT to 23. For that you need CONFIG_ADVANCED_OPTIONS
and CONFIG_DATA_SHIFT_BOOL.
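
To make the "4 BATs or 8 BATs" question concrete, here is a minimal user-space sketch of the scan that the pr_err() in the diff above instruments. It is not the kernel code: `struct bat_model` and `find_free_bat_model()` are hypothetical names, and the only things taken from the source are the `batu & 3` test and the fact that the message prints the number of BATs scanned. On classic 32-bit PowerPC the low two bits of the upper BAT word are the Vs/Vp valid bits, so a BAT counts as free when both are clear.

```c
#include <stdint.h>

/* Hypothetical stand-in for one BAT entry; only the upper word is examined. */
struct bat_model {
	uint32_t batu;	/* upper word: low two bits are the Vs/Vp valid bits */
	uint32_t batl;	/* lower word: unused in this sketch */
};

#define BAT_VALID_MASK 0x3u	/* same mask as the `batu & 3` test in the diff */

/*
 * Return the index of the first entry with both valid bits clear,
 * or -1 when all n entries are in use (the "NO FREE BAT (n)" case).
 */
static int find_free_bat_model(const struct bat_model *bats, int n)
{
	for (int b = 0; b < n; b++) {
		if (!(bats[b].batu & BAT_VALID_MASK))
			return b;
	}
	return -1;
}
```

Calling `find_free_bat_model(bats, 4)` versus `find_free_bat_model(bats, 8)` on the same array shows why the printed count matters: a free entry at index 4 or above is only found when 8 BATs are scanned.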

But regardless of that there is a problem we need to find out, because
it should work without BATs.

As the BAT allocation fails, it falls back to:

	phys = memblock_phys_alloc_range(k_end - k_start, PAGE_SIZE, 0,
						 MEMBLOCK_ALLOC_ANYWHERE);
		if (!phys)
			return -ENOMEM;
	}

	ret = kasan_init_shadow_page_tables(k_start, k_end);
	if (ret)
		return ret;

	for (k_cur = k_start; k_cur < k_end; k_cur += PAGE_SIZE) {
		pmd_t *pmd = pmd_off_k(k_cur);
		pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_start), PAGE_KERNEL);

		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
	}
	flush_tlb_kernel_range(k_start, k_end);
	memset(kasan_mem_to_shadow(start), 0, k_end - k_start);


While the __weak function that you confirmed working is:

	ret = kasan_init_shadow_page_tables(k_start, k_end);
	if (ret)
		return ret;

	block = memblock_alloc(k_end - k_start, PAGE_SIZE);
	if (!block)
		return -ENOMEM;

	for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
		pmd_t *pmd = pmd_off_k(k_cur);
		void *va = block + k_cur - k_start;
		pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);

		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
	}
	flush_tlb_kernel_range(k_start, k_end);


I'm having a hard time understanding what could be wrong in the first place.

Could you try the following change:
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 9954b7a3b7ae..e04f21908c6a 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -38,7 +38,7 @@ int __init kasan_init_region(void *start, size_t size)

   	if (k_nobat < k_end) {
   		phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
-						 MEMBLOCK_ALLOC_ANYWHERE);
+						 MEMBLOCK_ALLOC_ACCESSIBLE);
   		if (!phys)
   			return -ENOMEM;
   	}
And also that one:

diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c
index a70828a6d935..bc1c075489f4 100644
--- a/arch/powerpc/mm/kasan/init_32.c
+++ b/arch/powerpc/mm/kasan/init_32.c
@@ -84,6 +84,9 @@ kasan_update_early_region(unsigned long k_start, unsigned long k_end, pte_t pte)
   {
   	unsigned long k_cur;

+	if (k_start == k_end)
+		return;
+
   	for (k_cur = k_start; k_cur != k_end; k_cur += PAGE_SIZE) {
   		pmd_t *pmd = pmd_off_k(k_cur);
   		pte_t *ptep = pte_offset_kernel(pmd, k_cur);


  
I tested the two vmlinux you sent me off-list, and both start without
problems on QEMU.

No problems show up on QEMU for me either. But QEMU does not seem able to mimic my G4 DP's configuration, which is a dual-CPU G4 with an SMP config.

So let's forget that for the moment, although you may try with
CONFIG_STRICT_KERNEL_RWX; in that case you should have enough BATs.

CONFIG_STRICT_KERNEL_RWX=y was enabled all along in my kernel .config. For comparison I disabled it. With STRICT_KERNEL_RWX disabled I get no output about BATs whatsoever. Details below.

In your last mail you say you tried with all patches. Did that include the
two changes above?

If not, can you perform the tests with those two changes in addition,
first one by one, then both together, depending on the result?

I think I did apply both, but I re-did the checks just to be sure. For my 'all patches applied' config please check the attached git diff.

dmesg with patch 1 "MEMBLOCK_ALLOC_ACCESSIBLE);" applied:

printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:170 0 30000000 1400000 2000000
__mmu_mapin_ram:147 0 1400000
__mmu_mapin_ram:156 1400000
__mmu_mapin_ram:147 1400000 30000000
NO FREE BAT (8)
__mmu_mapin_ram:156 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:129
kasan_mmu_init:132 0
kasan_mmu_init:137
ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
Linux version 6.6.0-rc1-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #23 SMP Thu Sep 14 13:05:23 CEST 2023
kasan_init_region: c0000000 30000000 f8000000 fe000000
NO FREE BAT (8)
kasan_init_region: loop f8000000 fe000000

dmesg with patch 2 "if (k_start == k_end) return;" applied:

printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:170 0 30000000 1400000 2000000
__mmu_mapin_ram:147 0 1400000
__mmu_mapin_ram:156 1400000
__mmu_mapin_ram:147 1400000 30000000
NO FREE BAT (8)
__mmu_mapin_ram:156 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:132
kasan_mmu_init:135 0
kasan_mmu_init:140
ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
Linux version 6.6.0-rc1-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #23 SMP Thu Sep 14 13:05:23 CEST 2023
kasan_init_region: c0000000 30000000 f8000000 fe000000
NO FREE BAT (8)
kasan_init_region: loop f8000000 fe000000

dmesg with both KASAN patches applied:

printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:170 0 30000000 1400000 2000000
__mmu_mapin_ram:147 0 1400000
__mmu_mapin_ram:156 1400000
__mmu_mapin_ram:147 1400000 30000000
NO FREE BAT (8)
__mmu_mapin_ram:156 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:132
kasan_mmu_init:135 0
kasan_mmu_init:140
ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
Linux version 6.6.0-rc1-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #23 SMP Thu Sep 14 13:05:23 CEST 2023
kasan_init_region: c0000000 30000000 f8000000 fe000000
NO FREE BAT (8)
kasan_init_region: loop f8000000 fe000000
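
The `kasan_init_region:` line in these logs can be cross-checked with a little arithmetic. The sketch below assumes the four printed values are start, size, k_start and k_end, in that order, and that the shadow offset is 0xe0000000; both are inferred from the numbers in the log rather than stated in the thread. The shift of 3 is the generic KASAN scale (one shadow byte per 8 bytes of memory). `mem_to_shadow()` and `shadow_size()` are hypothetical helper names.

```c
#include <stdint.h>

#define SHADOW_SCALE_SHIFT	3		/* generic KASAN: 1 shadow byte per 8 bytes */
#define SHADOW_OFFSET_ASSUMED	0xe0000000u	/* assumption, inferred from the log values */

/* Map a 32-bit kernel address to its shadow address under the assumptions above. */
static uint32_t mem_to_shadow(uint32_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET_ASSUMED;
}

/* Size in bytes of the shadow region covering [start, start + size). */
static uint32_t shadow_size(uint32_t start, uint32_t size)
{
	return mem_to_shadow(start + size) - mem_to_shadow(start);
}
```

With start = 0xc0000000 and size = 0x30000000 this gives 0xf8000000 and 0xfe000000, matching the k_start and k_end values printed in the log, for a shadow region of 0x06000000 bytes (96 MiB) that has to be mapped either by BATs or by page tables.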

dmesg with both KASAN patches and STRICT_KERNEL_RWX=n applied:

printk: bootconsole [udbg0] enabled
Total memory = 2048MB; using 4096kB for hash table
mapin_ram:125
mmu_mapin_ram:170 0 30000000 1400000 2000000
__mmu_mapin_ram:147 0 1400000
__mmu_mapin_ram:156 1400000
__mmu_mapin_ram:147 1400000 30000000
__mmu_mapin_ram:156 20000000
__mapin_ram_chunk:107 20000000 30000000
__mapin_ram_chunk:117
mapin_ram:134
kasan_mmu_init:132
kasan_mmu_init:135 0
kasan_mmu_init:140

Many thanks for your help and perseverance
Christophe

You're welcome! Same to you! :)

Regards,
Erhard
