Re: [PATCH RFC v6 00/11] vhost: ring format independence
From: Stefano Garzarella <sgarzare@redhat.com>
Date: 2020-06-08 17:30:36
Also in:
kvm, lkml, virtualization
Hi Michael,

On Mon, Jun 08, 2020 at 08:52:51AM -0400, Michael S. Tsirkin wrote:
> This adds infrastructure required for supporting multiple ring formats.
>
> The idea is as follows: we convert descriptors to an independent
> format first, and process that converting to iov later.
>
> Used ring is similar: we fetch into an independent struct first,
> convert that to IOV later.
>
> The point is that we have a tight loop that fetches descriptors, which
> is good for cache utilization. This will also allow all kind of
> batching tricks - e.g. it seems possible to keep SMAP disabled while
> we are fetching multiple descriptors.
>
> For used descriptors, this allows keeping track of the buffer length
> without need to rescan IOV.
>
> This seems to perform exactly the same as the original code based on
> a microbenchmark. Lightly tested. More testing would be very much
> appreciated.
while testing vhost-vsock I found some issues in vhost-net (the VM also
had a virtio-net device).

This is the dmesg of the host (which is itself a QEMU VM):

[ 171.860074] CPU: 0 PID: 16613 Comm: vhost-16595 Not tainted 5.7.0-ste-12703-gaf7b4801030c-dirty #6
[ 171.862210] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 171.865998] Call Trace:
[ 171.866440] <IRQ>
[ 171.866817] dump_stack+0x57/0x7a
[ 171.867440] nmi_cpu_backtrace.cold+0x14/0x54
[ 171.868233] ? lapic_can_unplug_cpu.cold+0x3b/0x3b
[ 171.869153] nmi_trigger_cpumask_backtrace+0x85/0x92
[ 171.870143] arch_trigger_cpumask_backtrace+0x19/0x20
[ 171.871134] rcu_dump_cpu_stacks+0xa0/0xd2
[ 171.872203] rcu_sched_clock_irq.cold+0x23a/0x41c
[ 171.873098] update_process_times+0x2c/0x60
[ 171.874119] tick_sched_timer+0x59/0x160
[ 171.874777] ? tick_switch_to_oneshot.cold+0x79/0x79
[ 171.875602] __hrtimer_run_queues+0x10d/0x290
[ 171.876317] hrtimer_interrupt+0x109/0x220
[ 171.877025] smp_apic_timer_interrupt+0x76/0x150
[ 171.877875] apic_timer_interrupt+0xf/0x20
[ 171.878563] </IRQ>
[ 171.878897] RIP: 0010:vhost_get_avail_buf+0x5f8/0x860 [vhost]
[ 171.879951] Code: 48 8b bb 88 00 00 00 48 85 ff 0f 84 ad 00 00 00 be 01 00 00 00 44 89 45 80 e8 24 52 08 c1 8b 43 68 44 8b 45 80 e9 e9 fb ff ff <45> 85 c0 0f 85 48 fd ff ff 48 8b 43 38 48 83 bb 38 45 00 00 00 48
[ 171.889938] RSP: 0018:ffffc90000397c40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 171.896828] RAX: 0000000000000040 RBX: ffff88822c3f4688 RCX: ffff888231090000
[ 171.898903] RDX: 0000000000000440 RSI: ffff888231090000 RDI: ffffc90000397c80
[ 171.901025] RBP: ffffc90000397ce8 R08: 0000000000000001 R09: ffffc90000397dc4
[ 171.903136] R10: 000000231edc461f R11: 0000000000000003 R12: 0000000000000001
[ 171.905213] R13: 0000000000000001 R14: ffffc90000397dd4 R15: ffff88822c3f87a8
[ 171.907553] get_tx_bufs+0x49/0x180 [vhost_net]
[ 171.909142] handle_tx_copy+0xb4/0x5c0 [vhost_net]
[ 171.911495] ? update_curr+0x67/0x160
[ 171.913376] handle_tx+0xb0/0xe0 [vhost_net]
[ 171.916451] handle_tx_kick+0x15/0x20 [vhost_net]
[ 171.919912] vhost_worker+0xb3/0x110 [vhost]
[ 171.923379] kthread+0x106/0x140
[ 171.925314] ? __vhost_add_used_n+0x1c0/0x1c0 [vhost]
[ 171.933388] ? kthread_park+0x90/0x90
[ 171.936148] ret_from_fork+0x22/0x30
[ 234.859212] rcu: INFO: rcu_sched self-detected stall on CPU
[ 234.860036] rcu: 0-....: (20981 ticks this GP) idle=962/1/0x4000000000000002 softirq=15513/15513 fqs=10340
[ 234.861547] (t=21003 jiffies g=24773 q=2390)
[ 234.862158] NMI backtrace for cpu 0
[ 234.862638] CPU: 0 PID: 16613 Comm: vhost-16595 Not tainted 5.7.0-ste-12703-gaf7b4801030c-dirty #6
[ 234.864008] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 234.866084] Call Trace:
[ 234.866395] <IRQ>
[ 234.866648] dump_stack+0x57/0x7a
[ 234.867079] nmi_cpu_backtrace.cold+0x14/0x54
[ 234.867679] ? lapic_can_unplug_cpu.cold+0x3b/0x3b
[ 234.868322] nmi_trigger_cpumask_backtrace+0x85/0x92
[ 234.869013] arch_trigger_cpumask_backtrace+0x19/0x20
[ 234.869747] rcu_dump_cpu_stacks+0xa0/0xd2
[ 234.870267] rcu_sched_clock_irq.cold+0x23a/0x41c
[ 234.870960] update_process_times+0x2c/0x60
[ 234.871578] tick_sched_timer+0x59/0x160
[ 234.872148] ? tick_switch_to_oneshot.cold+0x79/0x79
[ 234.872949] __hrtimer_run_queues+0x10d/0x290
[ 234.873711] hrtimer_interrupt+0x109/0x220
[ 234.874271] smp_apic_timer_interrupt+0x76/0x150
[ 234.874913] apic_timer_interrupt+0xf/0x20
[ 234.876507] </IRQ>
[ 234.876799] RIP: 0010:vhost_get_avail_buf+0x8a/0x860 [vhost]
[ 234.877828] Code: 8d 72 06 00 00 85 c0 0f 85 fb 02 00 00 8b 57 70 89 d0 2d 00 04 00 00 0f 88 72 06 00 00 45 31 c0 4c 8d bb 20 41 00 00 4d 89 ee <44> 0f b7 a3 08 01 00 00 66 44 3b a3 0a 01 00 00 0f 84 58 05 00 00
[ 234.882059] RSP: 0018:ffffc90000397c40 EFLAGS: 00000283 ORIG_RAX: ffffffffffffff13
[ 234.883227] RAX: 0000000000000040 RBX: ffff88822c3f4688 RCX: ffff888231090000
[ 234.884317] RDX: 0000000000000440 RSI: ffff888231090000 RDI: ffffc90000397c80
[ 234.886531] RBP: ffffc90000397ce8 R08: 0000000000000001 R09: ffffc90000397dc4
[ 234.891840] R10: 000000231edc461f R11: 0000000000000003 R12: 0000000000000001
[ 234.896670] R13: 0000000000000001 R14: ffffc90000397dd4 R15: ffff88822c3f87a8
[ 234.900918] get_tx_bufs+0x49/0x180 [vhost_net]
[ 234.904280] handle_tx_copy+0xb4/0x5c0 [vhost_net]
[ 234.916402] ? update_curr+0x67/0x160
[ 234.917688] handle_tx+0xb0/0xe0 [vhost_net]
[ 234.918865] handle_tx_kick+0x15/0x20 [vhost_net]
[ 234.920366] vhost_worker+0xb3/0x110 [vhost]
[ 234.921500] kthread+0x106/0x140
[ 234.922219] ? __vhost_add_used_n+0x1c0/0x1c0 [vhost]
[ 234.923595] ? kthread_park+0x90/0x90
[ 234.924442] ret_from_fork+0x22/0x30
[ 297.870095] rcu: INFO: rcu_sched self-detected stall on CPU
[ 297.871352] rcu: 0-....: (36719 ticks this GP) idle=962/1/0x4000000000000002 softirq=15513/15513 fqs=18087
[ 297.873585] (t=36756 jiffies g=24773 q=2853)
[ 297.874478] NMI backtrace for cpu 0
[ 297.875229] CPU: 0 PID: 16613 Comm: vhost-16595 Not tainted 5.7.0-ste-12703-gaf7b4801030c-dirty #6
[ 297.877204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 297.881644] Call Trace:
[ 297.882185] <IRQ>
[ 297.882621] dump_stack+0x57/0x7a
[ 297.883387] nmi_cpu_backtrace.cold+0x14/0x54
[ 297.884390] ? lapic_can_unplug_cpu.cold+0x3b/0x3b
[ 297.885568] nmi_trigger_cpumask_backtrace+0x85/0x92
[ 297.886746] arch_trigger_cpumask_backtrace+0x19/0x20
[ 297.888260] rcu_dump_cpu_stacks+0xa0/0xd2
[ 297.889508] rcu_sched_clock_irq.cold+0x23a/0x41c
[ 297.890803] update_process_times+0x2c/0x60
[ 297.893357] tick_sched_timer+0x59/0x160
[ 297.895143] ? tick_switch_to_oneshot.cold+0x79/0x79
[ 297.897832] __hrtimer_run_queues+0x10d/0x290
[ 297.899841] hrtimer_interrupt+0x109/0x220
[ 297.900909] smp_apic_timer_interrupt+0x76/0x150
[ 297.903543] apic_timer_interrupt+0xf/0x20
[ 297.906509] </IRQ>
[ 297.908004] RIP: 0010:vhost_get_avail_buf+0x92/0x860 [vhost]
[ 297.911536] Code: 85 fb 02 00 00 8b 57 70 89 d0 2d 00 04 00 00 0f 88 72 06 00 00 45 31 c0 4c 8d bb 20 41 00 00 4d 89 ee 44 0f b7 a3 08 01 00 00 <66> 44 3b a3 0a 01 00 00 0f 84 58 05 00 00 8b 43 28 83 e8 01 41 21
[ 297.930274] RSP: 0018:ffffc90000397c40 EFLAGS: 00000283 ORIG_RAX: ffffffffffffff13
[ 297.934056] RAX: 0000000000000040 RBX: ffff88822c3f4688 RCX: ffff888231090000
[ 297.938371] RDX: 0000000000000440 RSI: ffff888231090000 RDI: ffffc90000397c80
[ 297.944222] RBP: ffffc90000397ce8 R08: 0000000000000001 R09: ffffc90000397dc4
[ 297.953817] R10: 000000231edc461f R11: 0000000000000003 R12: 0000000000000001
[ 297.956453] R13: 0000000000000001 R14: ffffc90000397dd4 R15: ffff88822c3f87a8
[ 297.960873] get_tx_bufs+0x49/0x180 [vhost_net]
[ 297.964163] handle_tx_copy+0xb4/0x5c0 [vhost_net]
[ 297.965871] ? update_curr+0x67/0x160
[ 297.966893] handle_tx+0xb0/0xe0 [vhost_net]
[ 297.968442] handle_tx_kick+0x15/0x20 [vhost_net]
[ 297.971327] vhost_worker+0xb3/0x110 [vhost]
[ 297.974275] kthread+0x106/0x140
[ 297.976141] ? __vhost_add_used_n+0x1c0/0x1c0 [vhost]
[ 297.979518] ? kthread_park+0x90/0x90
[ 297.981665] ret_from_fork+0x22/0x30