Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
From: Nicholas Piggin <npiggin@gmail.com>
Date: 2019-03-29 09:15:45
Suraj Jitindar Singh's on March 29, 2019 3:20 pm:
On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:
On 3/27/19 5:37 PM, Cédric Le Goater wrote:
On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
With qemu-system-ppc64le -machine pseries -smp 4 I get:

# chrt 1 hackbench
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 100 messages of 100 bytes
Oops: Exception in kernel mode, sig: 4 [#1]
LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
NIP: c000000000046978 LR: c000000000046a38 CTR: c0000000000b0150
REGS: c0000001fffeb8e0 TRAP: 0700 Not tainted (5.1.0-rc2)
MSR: 8000000000089033 <SF,EE,ME,IR,DR,RI,LE> CR: 42000874 XER: 00000000
CFAR: c000000000046a34 IRQMASK: 1
GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00 0000000028000000
…
NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
Call Trace:
[c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0 (unreliable)
[c0000001fffebba0] [c0000000000b0170] smp_pseries_cause_ipi+0x20/0x70
[c0000001fffebbd0] [c00000000004b02c] arch_send_call_function_single_ipi+0x8c/0xa0
[c0000001fffebbf0] [c0000000001de600] irq_work_queue_on+0xe0/0x130
[c0000001fffebc30] [c0000000001340c8] rto_push_irq_work_func+0xc8/0x120
…
Instruction dump:
60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000 3929ffff
7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020 3c4c00a2 38425080
---[ end trace eb842b544538cbdf ]---
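
A quick way to check which instruction trapped is to decode the word marked with angle brackets in the instruction dump. The following is a minimal sketch, not kernel or QEMU code: the helper name decode_msgsndp is made up for illustration, and the base encoding 0x7c00011c is assumed to be the msgsndp opcode with RB = 0, as in the Linux powerpc opcode header.

```python
# Assumed base encoding of msgsndp with RB = 0 (primary opcode 31,
# extended opcode 142). Treat this constant as an assumption to verify
# against the Power ISA or the kernel's opcode header.
MSGSNDP_BASE = 0x7C00011C

# RB occupies 5 bits starting 11 bits up from the least significant bit.
RB_SHIFT = 11
RB_MASK = 0x1F << RB_SHIFT


def decode_msgsndp(insn):
    """Return the RB register number if insn encodes msgsndp, else None."""
    without_rb = insn & ~RB_MASK & 0xFFFFFFFF
    if without_rb != MSGSNDP_BASE:
        return None
    return (insn & RB_MASK) >> RB_SHIFT


# The faulting word from the instruction dump in the oops.
faulting = 0x7C00191C
rb = decode_msgsndp(faulting)
if rb is None:
    print("not msgsndp")
else:
    print("msgsndp r%d" % rb)
```

Under these assumptions the faulting word decodes as msgsndp with r3 as the operand, which fits the TRAP: 0700 (program check) and sig: 4 reported in the oops.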

This is unusual, and it causes the powerpc code to crash because the rt scheduler is telling irq_work_queue_on to queue work on this CPU. Is that allowed? There are no warnings in there, but it must be a rarely tested path. Would it be better to ban it? Steven, is this queue_work_on to self by design?

and I was wondering whether this is a qemu bug or the kernel is using an opcode it should rather not. If I skip doorbell_try_core_ipi() in smp_pseries_cause_ipi() then there is no crash. The comment says "POWER9 should not use this handler" so…

I would say Linux is using a msgsndp instruction which is not implemented in QEMU TCG. But why have we started using dbells in Linux?

Yeah, the kernel must have used msgsndp, which isn't implemented for TCG yet. We use doorbells in Linux, but only for threads which are on the same core. And when I try to construct a situation with more than 1 thread per core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1 thread/core on a pseries machine". So I wonder why the guest thinks it can use msgsndp...

IPI to self, evidently. Under TCG, QEMU really should either implement the instruction or remove the DBELL feature.

Thanks,
Nick
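
To illustrate the "IPI to self" point: if doorbells are only used between threads of the same core, then with one thread per core the only CPU that passes the same-core check is the sender itself. The sketch below is a simplified model under that assumption, not the kernel's implementation. The names core_of and would_use_doorbell are hypothetical, and the mapping of CPU numbers to cores by integer division is an assumption made for the example.

```python
def core_of(cpu, threads_per_core):
    """Hypothetical mapping: consecutive CPU numbers share a core."""
    return cpu // threads_per_core


def would_use_doorbell(this_cpu, target_cpu, threads_per_core):
    """Model of a same-core check: true when both CPUs sit on one core.

    Note that a CPU is always on the same core as itself, so an IPI to
    self passes this check regardless of threads_per_core.
    """
    return core_of(this_cpu, threads_per_core) == core_of(target_cpu, threads_per_core)


# -smp 4 under TCG on pseries: 4 CPUs, 1 thread per core.
threads_per_core = 1
this_cpu = 0
for target in range(4):
    path = "doorbell" if would_use_doorbell(this_cpu, target, threads_per_core) else "other IPI path"
    print("cpu %d -> cpu %d: %s" % (this_cpu, target, path))
```

In this model only the cpu 0 -> cpu 0 case takes the doorbell path, which matches the call trace, where irq_work_queue_on ends up in doorbell_try_core_ipi on the CPU that queued the work.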