Thread: 23 messages, 10 authors, 2019-12-20

Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: 2019-12-19 10:50:04

Hi folks,

I'm actually still experiencing this sporadically in the WireGuard test 
suite, which you can see being run on https://build.wireguard.com/ . 
About 50% of the time the powerpc64 build will fail at a place like this:

[   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
[   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
[   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
[   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
[   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
[   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48008288  XER: 00000000
[   65.150352] CFAR: c0000000000332bc IRQMASK: 1
[   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
[   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
[   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
[   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
[   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
[   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
[   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
[   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
[   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
[   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
[   65.151913] Call Trace:
[   65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
[   65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
[   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
[   65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
[   65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
[   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
[   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
[   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
[   65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
[   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
[   65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
[   65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
[   65.154854] Instruction dump:
[   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
[   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
[   65.156155] ---[ end trace 6180d12e268ffdaf ]---
[   65.185452]
[   66.187490] Kernel panic - not syncing: Fatal exception

This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.

I'm not totally sure what's going on here. The kernel is built for
pseries and runs on QEMU's pseries machine model, and I see that
selecting pseries in Kconfig forces the selection of
'config PPC_DOORBELL' (twice in the same section, actually). Then inside
the kernel there appears to be some runtime CPU check for doorbell
support. Is this a case in which QEMU is advertising doorbell support
that TCG doesn't have? Or is something else happening here?
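
For what it's worth, the word marked <7c00411c> in the instruction dump
can be decoded by hand. Below is a rough sketch, not anything taken from
the kernel or QEMU sources: the helper name decode_xform is made up for
illustration, and it assumes the standard X-form field layout plus the
encoding 0x7c00011c for msgsndp with RB = 0 (my reading of the kernel's
ppc-opcode.h). If those assumptions hold, the faulting instruction is
"msgsndp r8", which would fit sig 4 / TRAP 0700 being raised on a
doorbell instruction.

```python
def decode_xform(word):
    """Split a 32-bit PowerPC X-form instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,      # extended opcode
    }


# The word marked with <...> in the oops "Instruction dump".
FAULTING_WORD = 0x7C00411C

# Assumption: encoding of msgsndp with the RB field cleared.
MSGSNDP_BASE = 0x7C00011C

fields = decode_xform(FAULTING_WORD)

# Clear the RB field and compare against the assumed msgsndp encoding.
is_msgsndp = (FAULTING_WORD & ~(0x1F << 11)) == MSGSNDP_BASE

print(fields)
print("looks like msgsndp r%d: %s" % (fields["rb"], is_msgsndp))
```

The RB field comes out as 8, and GPR08 in the register dump holds
0x28000000, which lines up with the two instructions just before the
faulting one setting up r8.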

Thanks,
Jason