Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue
From: Mike Christie <michael.christie@oracle.com>
Date: 2023-06-01 16:33:40
Also in:
kvm, lkml, virtualization
On 6/1/23 2:47 AM, Stefano Garzarella wrote:
quoted
static void vhost_worker_free(struct vhost_dev *dev) { - struct vhost_worker *worker = dev->worker; + struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk); - if (!worker) + if (!vtsk) return; - dev->worker = NULL; - WARN_ON(!llist_empty(&worker->work_list)); - vhost_task_stop(worker->vtsk); - kfree(worker); + vhost_task_stop(vtsk); + WARN_ON(!llist_empty(&dev->worker.work_list)); + WRITE_ONCE(dev->worker.vtsk, NULL);The patch LGTM, I just wonder if we should set dev->worker to zero here,
We might want to just set kcov_handle to zero for now. In 6.3 and older, I think we could do: 1. vhost_dev_set_owner could successfully set dev->worker. 2. vhost_transport_send_pkt runs vhost_work_queue and sees worker is set and adds the vhost_work to the work_list. 3. vhost_dev_set_owner fails in vhost_attach_cgroups, so we stop the worker before the work can be run and set worker to NULL. 4. We clear kcov_handle and return. We leave the work on the work_list. 5. Userspace can then retry vhost_dev_set_owner. If that works, then the work gets executed ok eventually. OR Userspace can just close the device. vhost_vsock_dev_release would eventually call vhost_dev_cleanup (vhost_dev_flush won't see a worker so will just return), and that will hit the WARN_ON but we would proceed ok. If I do a memset of the worker, then if userspace were to retry VHOST_SET_OWNER, we would lose the queued work since the work_list would get zero'd. I think it's unlikely this ever happens, but you know best so let me know if this a real issue.