[BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9
From: Venkat Rao Bagalkote <hidden>
Date: 2026-05-25 14:08:22
Greetings!!!

I am seeing an early-boot kernel panic caused by a NULL pointer dereference on a POWER9 (pSeries) system when testing linux-next (next-20260522).

Traces:

[    0.038567] Big cores detected but using small core scheduling
[    0.038796] BUG: Kernel NULL pointer dereference at 0x00000000
[    0.038804] Faulting instruction address: 0xc000000000e58504
[    0.038812] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.038819] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
[    0.038830] Modules linked in:
[    0.038840] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc6+ #14 PREEMPTLAZY
[    0.038851] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[    0.038860] NIP: c000000000e58504 LR: c000000000e58500 CTR: 0000000000000000
[    0.038869] REGS: c0000000090e78e0 TRAP: 0380 Not tainted (7.0.0-rc6+)
[    0.038878] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44002242 XER: 20040003
[    0.038907] CFAR: c00000000093f3f0 IRQMASK: 0
[    0.038907] GPR00: c00000000038b3b8 c0000000090e7b80 c00000000259a800 0000000000000000
[    0.038907] GPR04: 0000000000000038 0000000000000038 c00000000c6e2560 0000000000000000
[    0.038907] GPR08: 0000000000000000 0000000000000037 0000ffffffffffff 0000000000000000
[    0.038907] GPR12: c000000000072730 c0000000051b0000 c00000000c6ee560 00000000ffffffff
[    0.038907] GPR16: 0000000000000000 0000000000000038 c0000000032c6b08 fffffffffffffff6
[    0.038907] GPR20: 0000000000000000 c000000004d1a6e0 0000000000000000 0000000000000000
[    0.038907] GPR24: 0000000000000000 0000000000000000 00000000ffffffff c00000000a3bf940
[    0.038907] GPR28: 0000000000000038 0000000000000000 0000000000000000 0000000000000000
[    0.039029] NIP [c000000000e58504] _find_first_bit+0x44/0x130
[    0.039043] LR [c000000000e58500] _find_first_bit+0x40/0x130
[    0.039054] Call Trace:
[    0.039060] [c0000000090e7b80] [c00000000416af20] schedutil_gov+0x0/0xa0 (unreliable)
[    0.039076] [c0000000090e7bc0] [c00000000038b3b8] build_sched_domains+0xad8/0xe50
[    0.039089] [c0000000090e7ce0] [c000000003045d78] sched_init_smp+0xa8/0x164
[    0.039102] [c0000000090e7d30] [c00000000300f374] kernel_init_freeable+0x250/0x370
[    0.039117] [c0000000090e7de0] [c000000000011f90] kernel_init+0x34/0x1e4
[    0.039129] [c0000000090e7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    0.039142] ---- interrupt: 0 at 0x0
[    0.039150] Code: 41820090 7c0802a6 393cffff fbe10038 7c7f1b78 fba10028 fbc10030 3bc00000 793dd7e2 f8010050 4bae6e9d 60000000 <e93f0000> 2c290000 408200bc 283c0040
[    0.039196] ---[ end trace 0000000000000000 ]---

Git bisect points to b5ea300a17e3 ("sched/cache: Make LLC id continuous") as the first bad commit.

Git Bisect Logs:

# git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [c1ecb239fa3456529a32255359fc78b69eb9d847] Add linux-next specific files for 20260522
git bisect bad c1ecb239fa3456529a32255359fc78b69eb9d847
# status: waiting for good commit(s), bad commit known
# good: [5200f5f493f79f14bbdc349e402a40dfb32f23c8] Linux 7.1-rc4
git bisect good 5200f5f493f79f14bbdc349e402a40dfb32f23c8
# good: [7cd27a0d57b8539366c98bb04fe48d1aff779ea9] Merge branch 'main' of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect good 7cd27a0d57b8539366c98bb04fe48d1aff779ea9
# good: [efb3dd6031ec9858c7285fd673970320c86c01f3] Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect good efb3dd6031ec9858c7285fd673970320c86c01f3
# bad: [1a6066d1c1243fdc5ed464032bbdf12e6710c027] Merge branch 'driver-core-next' of https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
git bisect bad 1a6066d1c1243fdc5ed464032bbdf12e6710c027
# good: [409a99cbc316d912c999fd75b9df042b25900e50] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
git bisect good 409a99cbc316d912c999fd75b9df042b25900e50
# bad: [af73f6b022c8c09a3234176892a18216be4cd984] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect bad af73f6b022c8c09a3234176892a18216be4cd984
# bad: [6a459eb254e4bff61546587eccd3091955123d24] Merge branch into tip/master: 'sched/core'
git bisect bad 6a459eb254e4bff61546587eccd3091955123d24
# good: [71ba4bb66c3a9287245d0f5fcfb27d4b951ba402] Merge branch into tip/master: 'locking/core'
git bisect good 71ba4bb66c3a9287245d0f5fcfb27d4b951ba402
# good: [f3b45696a160a2230d846de8f706e835984ae65b] Merge branch into tip/master: 'objtool/core'
git bisect good f3b45696a160a2230d846de8f706e835984ae65b
# bad: [c99b8593b060931c5a0a4b701689f8d6a2c00dbf] sched/cache: Fix stale preferred_llc for a new task
git bisect bad c99b8593b060931c5a0a4b701689f8d6a2c00dbf
# bad: [5b1d5e6db20a6c64ffb95d04578db8c4b0228eea] sched/cache: Respect LLC preference in task migration and detach
git bisect bad 5b1d5e6db20a6c64ffb95d04578db8c4b0228eea
# bad: [46afe3af7ead57190b6d362e214814ec804e3b7b] sched/cache: Track LLC-preferred tasks per runqueue
git bisect bad 46afe3af7ead57190b6d362e214814ec804e3b7b
# good: [f025ef275388742643a2c33f00a0d9c0af3112ee] sched/cache: Record per LLC utilization to guide cache aware scheduling decisions
git bisect good f025ef275388742643a2c33f00a0d9c0af3112ee
# bad: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous
git bisect bad b5ea300a17e37eada7a98561fbd34a3054578713
# good: [23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1] sched/cache: Introduce helper functions to enforce LLC migration policy
git bisect good 23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1
# first bad commit: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous

b5ea300a17e37eada7a98561fbd34a3054578713 is the first bad commit
commit b5ea300a17e37eada7a98561fbd34a3054578713
Author: Tim Chen [off-list ref]
Date: Wed Apr 1 14:52:17 2026 -0700

    sched/cache: Make LLC id continuous

    Introduce an index mapping between CPUs and their LLCs. This provides
    a roughly continuous per LLC index needed for cache-aware load
    balancing in later patches.

    The existing per_cpu llc_id usually points to the first CPU of the
    LLC domain, which is sparse and unsuitable as an array index. Using
    llc_id directly would waste memory.

    With the new mapping, CPUs in the same LLC share an approximate
    continuous id:

      per_cpu(llc_id, CPU=0...15) = 0
      per_cpu(llc_id, CPU=16...31) = 1
      per_cpu(llc_id, CPU=32...47) = 2
      ...

    Note that the LLC IDs are allocated via bitmask, so the IDs may be
    reused during CPU offline->online transitions.

    Suggested-by: Peter Zijlstra (Intel) [off-list ref]
    Originally-by: K Prateek Nayak [off-list ref]
    Co-developed-by: Chen Yu [off-list ref]
    Signed-off-by: Chen Yu [off-list ref]
    Signed-off-by: Tim Chen [off-list ref]
    Signed-off-by: Peter Zijlstra (Intel) [off-list ref]
    Link: https://patch.msgid.link/047ef46339e4db497b54a89940a7ebedf27fcf28.1775065312.git.tim.c.chen@linux.intel.com

 kernel/sched/core.c     |  2 ++
 kernel/sched/sched.h    |  3 ++
 kernel/sched/topology.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 93 insertions(+), 2 deletions(-)

If you happen to fix this, please add the tag below.

Reported-by: Venkat Rao Bagalkote <redacted>

Regards,
Venkat.
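
For readers who want a concrete picture of the mapping described in the bisected commit message, here is a minimal userspace sketch of dense LLC id allocation from a bitmask. It is not the kernel's code: every name in it (MAX_LLCS, NR_CPUS_DEMO, llc_id_bitmap, cpu_llc_id, llc_id_alloc, llc_id_free, llc_assign_range) is hypothetical, and it assumes at most 64 LLCs so that a single 64-bit word can serve as the bitmap. It illustrates only the two properties the commit message states: CPUs in one LLC share a small dense id, and a freed id can be handed out again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes for illustration only; not taken from the kernel. */
#define MAX_LLCS      64
#define NR_CPUS_DEMO  48
#define LLC_ID_NONE   (-1)

static uint64_t llc_id_bitmap;        /* bit n set => dense id n is in use */
static int cpu_llc_id[NR_CPUS_DEMO];  /* dense LLC id recorded per CPU */

/* Hand out the lowest free id, or LLC_ID_NONE when every id is taken. */
static int llc_id_alloc(void)
{
    for (int id = 0; id < MAX_LLCS; id++) {
        uint64_t bit = UINT64_C(1) << id;

        if (!(llc_id_bitmap & bit)) {
            llc_id_bitmap |= bit;
            return id;
        }
    }
    return LLC_ID_NONE;
}

/* Release an id so that a later allocation may reuse it. */
static void llc_id_free(int id)
{
    assert(id >= 0 && id < MAX_LLCS);
    llc_id_bitmap &= ~(UINT64_C(1) << id);
}

/* Give every CPU in [first, last] the same freshly allocated dense id. */
static int llc_assign_range(int first, int last)
{
    int id = llc_id_alloc();

    if (id == LLC_ID_NONE)
        return LLC_ID_NONE;
    for (int cpu = first; cpu <= last; cpu++)
        cpu_llc_id[cpu] = id;
    return id;
}
```

Calling llc_assign_range(0, 15), llc_assign_range(16, 31) and llc_assign_range(32, 47) in that order yields ids 0, 1 and 2, matching the layout quoted in the commit message. Freeing id 1 and allocating again returns 1, which is the reuse the commit message warns about for CPU offline->online transitions.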