Fail to boot KCSAN-enabled kernel (Kernel panic - not syncing: Fatal exception, Unrecoverable FP Unavailable Exception 800 at c0000000022cafe0) on a PowerMac G5, kernel 6.6.1

From: Erhard Furtner <hidden>
Date: 2023-11-13 23:38:01
Also in: lkml

Greetings!

Both my PowerMac G5 and my Talos II (running a BE kernel+system) fail to boot a KCSAN-enabled kernel. The same kernel without KCSAN enabled boots just fine.

I tried to dig a little deeper with a stripped-down .config on the G5 with CONFIG_KCSAN=y and CONFIG_KCSAN_STRICT=y, and finally got some output via CONFIG_PPC_EARLY_DEBUG_G5=y. The machine freezes before output via serial console or netconsole is available, so I had to take screenshots and transcribe them.

On EARLY_DEBUG_G5 the last output (there's some before but it gets overwritten) shown on the screen is:

[c0000000022ebd90] [c0000000000f95c4] __cpuhp_setup_state_cpuslocked+0x1b4/0x590
[c0000000022ebd60] [c0000000000f9ab8] __cpuhp_setup_state+0x118/0x2e8
[c0000000022ebef0] [c000000002023dc8] poking_init+0x3c/0x90
[c0000000022ebf10] [c0000000020059a8] start_kernel+0x6a0/0x99c
[c0000000022ebfe0] [c00000000000cb48] start_here_common+0x1c/0x20
Code: 7fff9830 7ad6f082 3bffffff 7936f00e 7fffa838 7bff1f24 7e96fa15 41820218 7e83a378 482eed09 60000000 <7eb6f82a> 2c350000 41820230 39200001
---[ end trace 0000000000000000 ]---

Kernel panic - not syncing: Fatal exception
Unrecoverable FP Unavailable Exception 800 at c0000000022cafe0
Oops: Unrecoverable FP Unavailable Exception, sig: 6 [#2]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G      D W
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
NIP:  c0000000022cafe0 LR: c000000000046178 CTR: c0000000022cafe0
REGS: c0000000022eb290 TRAP: 0800   Tainted: G      D W           ()
MSR:  9000000000001032 <SF,HV,ME,IR,DR,RI>  CR: 84048882  XER: 00000000
IRQMASK: 3
GPR00: 0000000000000000 c0000000022eb530 c0000000016cb100 fffffffffffffffe
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000000 00000000023404df
GPR12: c0000000022cafe0 c000000002388000 000000000014aa88 0000000000000000
GPR16: 00000000ff9aac70 c0000000003593e0 c0000000003584a0 0000000000000009
GPR20: f886000f7ce74a00 0000000000000001 0000000000000000 0000000000000000
GPR24: c0000000022f8ab8 c0000000023404d0 c0000000022cbac0 fffffffffffffffe
GPR28: c0000000023484d8 00000000000f4240 c0000000023404cc c0000000023404c0
NIP [c0000000022cafe0] init_task+0x9e0/0xf00
LR [0000000000046170] __smp_send_nmi_ipi+0x4c0/0x610
Call Trace:
[c0000000022eb530] [0000000000046170] __smp_send_nmi_ipi+0x480/0x610 (unreliable)
[c0000000022eb5c0] [0000000000046b60] smp_send_stop+0x30/0x60
[c0000000022eb5e0] [00000000000f5bb0] panic+0x274/0x554
[c0000000022eb6a0] [000000000002313c] die+0x4bc/0x4c0
[c0000000022eb760] [0000000000062e98] bad_page_fault+0x200/0x2c0
[c0000000022eb7f0] [0000000000063148] do_bad_segment_interrupt+0x58/0xe0
[c0000000022eb828] [0000000000007afc] data_access_slb_common_virt+0x19c/0x1a0
--- interrupt: 380 at hash__map_kernel_page+0x178/0x460
NIP:  c000000000069c48 LR: c000000000069c44 CTR: 0000000000000000
REGS: c0000000022eb850 TRAP: 0380   Tainted: G      D W           ()
MSR:  9000000000001032 <SF,HV,ME,IR,DR,RI>  CR: 44042882  XER: 00000000
IRQMASK: 1
GPR00: 0000000000000000 c0000000022ebaf0 c0000000016cb100 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 c000000002388000 000000000014aa88 0000000000000000
GPR16: 00000000ff9aac70 c0000000003593e0 c0000000003584a0 0000000000000009
GPR20: f886000f7ce74a00 0000000c000cd000 f886000f7ce74a88 0000000000000007
GPR24: 0000000000000009 c0000000023429e0 c0000000023429e8 c0000000022d8d80
GPR28: 800000000000018e 0000000002322000 c0000000023404cc 0000000000000000
NIP [c000000000069c48] hash__map_kernel_page+0x178/0x460
LR [c000000000069c44] hash__map_kernel_page+0x174/0x460
--- interrupt: 380
[c0000000022ebaf0] [c0000000022ebb80] init_stack+0x3b80/0x4000 (unreliable)
[c0000000022ebbd0] [c000000000077d08] text_area_cpu_up+0x78/0x490
[c0000000022ebc80] [c0000000000f6608] cpuhp_invoke_callback+0x218/0x490
[c0000000022ebcf0] [c0000000000f8e28] cpuhp_issue_call+0x4c8/0x4f0
[c0000000022ebd90] [c0000000000f95c4] __cpuhp_setup_state_cpuslocked+0x1b4/0x590
[c0000000022ebd60] [c0000000000f9ab8] __cpuhp_setup_state+0x118/0x2e8
[c0000000022ebef0] [c000000002023dc8] poking_init+0x3c/0x90
[c0000000022ebf10] [c0000000020059a8] start_kernel+0x6a0/0x99c
[c0000000022ebfe0] [c00000000000cb48] start_here_common+0x1c/0x20
Code: c0000000 0006f230 00000000 00000000 00000000 c0000004 7fe06608 c0000000 023432a8 00000000 00000001 <c0000000> 022cb020 00000000 00000000
---[ end trace 0000000000000000 ]---

Kernel panic - not syncing: Fatal exception
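
As a reading aid for the register dumps above, the MSR value 9000000000001032 can be decoded by hand or with a short Python sketch like the one below. The bit positions are an assumption taken from the usual 64-bit PowerPC MSR layout (as in the kernel's arch/powerpc/include/asm/reg.h), and only a subset of bits is listed; the helper name decode_msr is made up for illustration.

```python
# Subset of 64-bit PowerPC MSR bits (bit number counted from the LSB).
# Assumed layout, not taken from the oops itself.
MSR_BITS = {
    63: "SF",   # 64-bit mode
    60: "HV",   # hypervisor state
    25: "VEC",  # AltiVec available
    15: "EE",   # external interrupts enabled
    14: "PR",   # problem (user) state
    13: "FP",   # floating point available
    12: "ME",   # machine check enable
    5: "IR",    # instruction relocation
    4: "DR",    # data relocation
    1: "RI",    # recoverable interrupt
    0: "LE",    # little-endian mode
}


def decode_msr(value):
    """Return the names of the known MSR bits set in value, highest bit first."""
    return [name for bit, name in sorted(MSR_BITS.items(), reverse=True)
            if (value >> bit) & 1]


flags = decode_msr(0x9000000000001032)
print(",".join(flags))             # SF,HV,ME,IR,DR,RI
print("FP enabled:", "FP" in flags)  # FP enabled: False
```

The decoded flags match the <SF,HV,ME,IR,DR,RI> annotation in the dump, and MSR[FP] is clear, which fits a floating-point instruction pattern raising the FP Unavailable exception (trap 0x800) at that address.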


When I additionally enable CONFIG_KCSAN_SELFTEST=y the machine freezes even earlier and I only get this:

ioremap() called early from iommu_init_early_dart+0x29c/0xb90. Use early_ioremap() instead
DART table allocated at: (____ptrval____)
DART IOMMU initialized for U4 type chipset
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
CPU maps initialized for 1 thread per core
 (thread shift is 0)
Allocated 2320 bytes for 2 pacas
-----------------------------------------------------
phys_mem_size     = 0x400000000
dcache_bsize      = 0x80
icache_bsize      = 0x80
cpu_features      = 0x00000100900c218a
  possible        = 0x001ffbebfbffb18f
  always          = 0x0000000000000180
cpu_user_features = 0xdc080000 0x00000000
mmu_features      = 0x0c008001
firmware_features = 0x0000000000000000
vmalloc start     = 0xc0003d0000000000
IO start          = 0xc0003e0000000000
vmemmap start     = 0xc0003f0000000000
hash-mmu: ppc64_pft_size    = 0x0
hash-mmu: htab_hash_mask    = 0x1fffff
-----------------------------------------------------
SMU: Driver 0.7 (c) 2005 Benjamin Herrenschmidt, IBM Corp.
ioremap() called early from smu_init+0x5dc/0x840. Use early_ioremap() instead
ioremap() called early from pmac_nvram_init+0x358/0xa3c. Use early_ioremap() instead
nvram: Checking bank 0...
nvram: gen0=1642, gen1=1641
nvram: Active bank is: 0
nvram: OF partition at 0x410
nvram: XP partition at 0x1020
nvram: NR partition at 0x1120
barrier-nospec: using ORI speculation barrier
barrier-nospec: patched 269 locations
Top of RAM: 0x480000000, Total RAM: 0x400000000
Memory hole size: 2048MB
Zone ranges:
  Normal   [mem 0x0000000000000000-0x000000047fffffff]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000007fffffff]
  node   0: [mem 0x0000000100000000-0x000000047fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000047fffffff]
On node 0, zone Normal: 524288 pages in unavailable ranges
percpu: Embedded 20 pages/cpu s43176 r0 d38744 u524288
pcpu-alloc: s43176 r0 d38744 u524288 alloc=1*1048576
pcpu-alloc: [0] 0 1


Kernel .config attached.

In contrast to the G5 and Talos II, KCSAN works OK on my PowerMac G4 DP. So this is probably ppc64-specific?

Regards,
Erhard
