Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

From: Mike Christie <michael.christie@oracle.com>
Date: 2023-06-01 16:33:40
Also in: kvm, lkml, virtualization

On 6/1/23 2:47 AM, Stefano Garzarella wrote:

> static void vhost_worker_free(struct vhost_dev *dev)
> {
> -    struct vhost_worker *worker = dev->worker;
> +    struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
>
> -    if (!worker)
> +    if (!vtsk)
>         return;
>
> -    dev->worker = NULL;
> -    WARN_ON(!llist_empty(&worker->work_list));
> -    vhost_task_stop(worker->vtsk);
> -    kfree(worker);
> +    vhost_task_stop(vtsk);
> +    WARN_ON(!llist_empty(&dev->worker.work_list));
> +    WRITE_ONCE(dev->worker.vtsk, NULL);
>
> The patch LGTM, I just wonder if we should set dev->worker to zero here,

We might want to just set kcov_handle to zero for now.

In 6.3 and older, I think the following sequence was possible:

1. vhost_dev_set_owner successfully sets dev->worker.
2. vhost_transport_send_pkt runs vhost_work_queue, sees that worker
   is set, and adds the vhost_work to the work_list.
3. vhost_dev_set_owner then fails in vhost_attach_cgroups, so we stop
   the worker before the work can be run and set worker to NULL.
4. We clear kcov_handle and return.

We leave the work on the work_list.

5. Userspace can then retry vhost_dev_set_owner. If that works, then the
work gets executed ok eventually.

OR

Userspace can just close the device. vhost_vsock_dev_release would
eventually call vhost_dev_cleanup (vhost_dev_flush won't see a worker
so will just return), and that will hit the WARN_ON but we would
proceed ok.
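
To make the sequence above easier to follow, here is a rough user-space
model of it. This is only a sketch, not kernel code: the model_* types and
functions are hypothetical stand-ins (a plain singly linked list instead of
the kernel llist, a bool instead of the worker task), and nothing here is
concurrent. It only shows that work queued in step 2 stays on the list
after the worker is stopped, and what each of the two follow-ups then sees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vhost_work entry. */
struct model_work {
    struct model_work *next;
    bool done;
};

/* Hypothetical stand-in for the worker: a "running" flag plus a list. */
struct model_worker {
    bool running;
    struct model_work *work_list;
};

/* Step 2: work is only queued if a worker is visible. */
static bool model_work_queue(struct model_worker *w, struct model_work *work)
{
    if (!w->running)
        return false;
    work->next = w->work_list;
    w->work_list = work;
    return true;
}

/* Count the entries still sitting on the list. */
static size_t model_pending(const struct model_worker *w)
{
    size_t n = 0;
    for (const struct model_work *p = w->work_list; p; p = p->next)
        n++;
    return n;
}

/* What a running worker would do: drain the list, return how many ran. */
static size_t model_run_all(struct model_worker *w)
{
    size_t ran = 0;
    while (w->work_list) {
        struct model_work *work = w->work_list;
        w->work_list = work->next;
        work->next = NULL;
        work->done = true;
        ran++;
    }
    return ran;
}

/* Steps 1-4: the owner is set, work is queued, then set-owner fails. */
static void model_failed_set_owner(struct model_worker *w,
                                   struct model_work *work)
{
    bool queued;

    w->running = true;                  /* step 1: worker is set */
    queued = model_work_queue(w, work); /* step 2: work lands on the list */
    assert(queued);
    (void)queued;
    w->running = false;                 /* step 3: stopped before work ran */
    /* step 4: only kcov_handle is cleared, the list is left untouched */
}

/* Follow-up A (step 5): userspace retries and the retry succeeds. */
size_t model_retry_outcome(void)
{
    struct model_worker w = { false, NULL };
    struct model_work work = { NULL, false };

    model_failed_set_owner(&w, &work);
    w.running = true;
    return model_run_all(&w);  /* number of stale works that finally ran */
}

/* Follow-up B: userspace closes the device instead. */
size_t model_close_outcome(void)
{
    struct model_worker w = { false, NULL };
    struct model_work work = { NULL, false };

    model_failed_set_owner(&w, &work);
    return model_pending(&w);  /* non-zero is the WARN_ON case at cleanup */
}
```

In this model, model_retry_outcome() reports the one stale work item as
executed, and model_close_outcome() reports one item still pending, which
corresponds to the non-empty work_list that trips the WARN_ON.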

If I do a memset of the worker, then if userspace were to retry
VHOST_SET_OWNER, we would lose the queued work since the work_list would
get zeroed. I think it's unlikely this ever happens, but you know best,
so let me know if this is a real issue.
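
The difference between the two cleanup options can be sketched in user
space as follows. Again this is an illustration under assumptions, not the
kernel structure: the sketch_* names are hypothetical, the field layout is
made up, and the list is a plain pointer rather than an llist_head. It
compares clearing only the task pointer and kcov_handle against a memset
of the whole worker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a queued work item. */
struct sketch_work {
    struct sketch_work *next;
};

/* Hypothetical stand-in for an embedded worker; layout is an assumption. */
struct sketch_worker {
    void *vtsk;                    /* stand-in for the worker task pointer */
    unsigned long kcov_handle;
    struct sketch_work *work_list; /* stand-in for the queued work list */
};

/* Count the entries still reachable from the worker. */
static size_t sketch_pending(const struct sketch_worker *w)
{
    size_t n = 0;
    for (const struct sketch_work *p = w->work_list; p; p = p->next)
        n++;
    return n;
}

/* Option 1: clear only the fields that must not survive the failure. */
static void sketch_clear_fields(struct sketch_worker *w)
{
    w->vtsk = NULL;
    w->kcov_handle = 0;
}

/* Option 2: zero the whole worker, which also drops the list head. */
static void sketch_clear_memset(struct sketch_worker *w)
{
    memset(w, 0, sizeof(*w));
}

/* Returns how many queued works are still reachable after cleanup. */
size_t sketch_pending_after(int use_memset)
{
    struct sketch_work work = { NULL };
    int dummy_task = 0;
    struct sketch_worker w = { &dummy_task, 42UL, &work };

    assert(sketch_pending(&w) == 1); /* one work item queued before cleanup */

    if (use_memset)
        sketch_clear_memset(&w);
    else
        sketch_clear_fields(&w);

    return sketch_pending(&w);
}
```

With the field-by-field clear the queued item is still reachable for a
later retry, while after the memset the list head is gone and the item is
lost, which is the trade-off described above.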
