Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
From: Cédric Le Goater <clg@kaod.org>
Date: 2019-03-27 17:02:53
On 3/27/19 5:37 PM, Cédric Le Goater wrote:
> On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
>> With qemu-system-ppc64le -machine pseries -smp 4 I get:
>>
>> |# chrt 1 hackbench
>> |Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
>> |Each sender will pass 100 messages of 100 bytes
>> | Oops: Exception in kernel mode, sig: 4 [#1]
>> | LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
>> | Modules linked in:
>> | CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
>> | NIP: c000000000046978 LR: c000000000046a38 CTR: c0000000000b0150
>> | REGS: c0000001fffeb8e0 TRAP: 0700 Not tainted (5.1.0-rc2)
>> | MSR: 8000000000089033 <SF,EE,ME,IR,DR,RI,LE> CR: 42000874 XER: 00000000
>> | CFAR: c000000000046a34 IRQMASK: 1
>> | GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00 0000000028000000 …
>> | NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
>> | LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
>> | Call Trace:
>> | [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0 (unreliable)
>> | [c0000001fffebba0] [c0000000000b0170] smp_pseries_cause_ipi+0x20/0x70
>> | [c0000001fffebbd0] [c00000000004b02c] arch_send_call_function_single_ipi+0x8c/0xa0
>> | [c0000001fffebbf0] [c0000000001de600] irq_work_queue_on+0xe0/0x130
>> | [c0000001fffebc30] [c0000000001340c8] rto_push_irq_work_func+0xc8/0x120 …
>> | Instruction dump:
>> | 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000 3929ffff
>> | 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020 3c4c00a2 38425080
>> | ---[ end trace eb842b544538cbdf ]---
>>
>> and I was wondering whether this is a qemu bug or the kernel is using an
>> opcode it should rather not. If I skip doorbell_try_core_ipi() in
>> smp_pseries_cause_ipi() then there is no crash.
>> The comment says "POWER9 should not use this handler" so…
>
> I would say Linux is using a msgsndp instruction which is not implemented
> in QEMU TCG. But why have we started using dbells in Linux ?
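For reference, the word marked <7c00191c> in the instruction dump can be decoded by hand to check the msgsndp reading. The following is only a minimal sketch: the helper name decode_x_form and the small lookup table are made up for illustration, and the field layout assumes the standard Power ISA X-form encoding (primary opcode in the top 6 bits, extended opcode in bits 21-30).

```python
def decode_x_form(word):
    """Split a 32-bit Power ISA X-form instruction word into its fields."""
    return {
        "primary": (word >> 26) & 0x3F,  # primary opcode, top 6 bits
        "rt": (word >> 21) & 0x1F,       # unused by the doorbell instructions
        "ra": (word >> 16) & 0x1F,       # unused by the doorbell instructions
        "rb": (word >> 11) & 0x1F,       # register holding the doorbell message
        "xo": (word >> 1) & 0x3FF,       # 10-bit extended opcode
    }


# Illustrative subset of (primary, extended) opcode pairs; not a full table.
DOORBELL_OPCODES = {
    (31, 142): "msgsndp",
    (31, 206): "msgsnd",
}

# The faulting word from the instruction dump in the oops.
FAULTING_WORD = 0x7C00191C

fields = decode_x_form(FAULTING_WORD)
mnemonic = DOORBELL_OPCODES.get((fields["primary"], fields["xo"]), "unknown")
print(f"{FAULTING_WORD:#010x}: {mnemonic} r{fields['rb']}")
```

Under these assumptions the word decodes to msgsndp with r3 as the operand, which is consistent with GPR03 holding 0000000028000000 in the register dump and with the TRAP: 0700 program check.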
Ah. It seems arch_local_irq_restore() / replay_interrupt() generated some interrupt.

C.