Re: [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device"
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2025-08-22 10:21:59
Also in:
stable
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true
surprise (aka hot unplug) removal. Drivers historically continued
processing and waiting for pending I/O and even continued synchronous
device reset during surprise removal. Devices have also continued
completing I/Os, doing DMA and allowing device reset after surprise
removal to support such drivers.
Supporting it correctly would require a new device capabilityIf a device is removed, it is removed. Windows drivers supported this since forever and it's just a Linux bug that it does not handle all the cases. This is not something you can handle with a capability.
and driver negotiation in the virtio specification to safely stop I/O and free queue memory. Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA.
If the device is gone, then DMA does not continue. IIUC what is going on for you, is that you have developed a surprise removal emulation that pretends to remove the device but actually the device is doing DMA. So of course things break then.
Hence, until such specification and devices
are invented, restore the previous behavior of treating surprise
removal as graceful removal to avoid regressions and maintain system
stability same as before the
commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
As explained above, previous analysis of solving this only in driver
was incomplete and non-reliable at [1] and at [2]; Hence reverting commit
43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
is still the best stand to restore failures of virtio net and
block devices.
[1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t (local)I can only repeat what I said then, this is not how we do kernel development.
[2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.com/ (local)
What was missing here, is handling corner cases. So let us please try to handle them. Here is how I would try to do it: - add a new driver callback - start a periodic timer task in virtio core on remove - in the timer, probe that the device is still present. if not, invoke a driver callback - cancel the task on device reset If you do not have the time, let me know and I will try to look into it.
quoted hunk ↗ jump to hunk
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baidu.com/ (local) Signed-off-by: Parav Pandit <redacted> --- drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c@@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev); - /* - * Device is marked broken on surprise removal so that virtio upper - * layers can abort any ongoing operation. - */ - if (!pci_device_is_present(pci_dev)) - virtio_break_device(&vp_dev->vdev); - pci_disable_sriov(pci_dev); unregister_virtio_device(&vp_dev->vdev);-- 2.26.2