Re: [RFC 7/11] virtio_pci: new, capability-aware driver.
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2012-01-13 02:32:42
On Fri, 2012-01-13 at 04:19 +0200, Michael S. Tsirkin wrote:
On Thu, Jan 12, 2012 at 12:12:17PM +1030, Rusty Russell wrote:quoted
On Thu, 12 Jan 2012 00:02:33 +0200, "Michael S. Tsirkin" [off-list ref] wrote:quoted
Look, we have a race currently. Let us not tie a bug fix to a huge rewrite with unclear performance benefits, please.In theory, yes. In practice, we bandaid it. I think in the short term we change ->get to get the entire sequence twice, and check it's the same. Theoretically, still racy, but it does cut the window. And we haven't seen the bug yet, either.I thought about this some more. Since we always get an interrupt on config changes, it seems that a rather robust method would be to just synchronize against that. Something like the below (warning - completely untested). Still need to think about memory barriers, overflow etc. What do you think?
Your interrupt may take an unpredictable amount of time to arrive, I don't see how you can use that as a guarantee. Cheers, Ben.
quoted hunk ↗ jump to hunk
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c index 03d1984..b5df385 100644 --- a/drivers/virtio/virtio_pci.c +++ b/drivers/virtio/virtio_pci.c@@ -57,6 +57,7 @@ struct virtio_pci_device unsigned msix_used_vectors; /* Whether we have vector per vq */ bool per_vq_vectors; + atomic_t config_changes; }; /* Constants for MSI-X */@@ -125,6 +126,19 @@ static void vp_finalize_features(struct virtio_device *vdev) iowrite32(vdev->features[0], vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES); } +/* wait for pending irq handlers */ +static void vp_synchronize_vectors(struct virtio_device *vdev) +{ + struct virtio_pci_device *vp_dev = to_vp_device(vdev); + int i; + + if (vp_dev->intx_enabled) + synchronize_irq(vp_dev->pci_dev->irq); + + for (i = 0; i < vp_dev->msix_vectors; ++i) + synchronize_irq(vp_dev->msix_entries[i].vector); +} + /* virtio config->get() implementation */ static void vp_get(struct virtio_device *vdev, unsigned offset, void *buf, unsigned len)@@ -134,9 +148,20 @@ static void vp_get(struct virtio_device *vdev, unsigned offset, VIRTIO_PCI_CONFIG(vp_dev) + offset; u8 *ptr = buf; int i; - - for (i = 0; i < len; i++) - ptr[i] = ioread8(ioaddr + i); + int uninitialized_var(c); + c = atomic_read(&vp_dev->config_changes); + /* Make sure read is done before we get the first config byte */ + rmb(); + do { + for (i = 0; i < len; i++) + ptr[i] = ioread8(ioaddr + i); + /* Synchronize with config interrupt */ + vp_synchronize_vectors(vdev); + /* + * For multi-byte fields, we might get a config change interrupt + * between byte reads. If this happens, retry the read. + */ + } while (c != atomic_read(&vp_dev->config_changes)) } /* the config->set() implementation. it's symmetric to the config->get()@@ -169,19 +194,6 @@ static void vp_set_status(struct virtio_device *vdev, u8 status) iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS); } -/* wait for pending irq handlers */ -static void vp_synchronize_vectors(struct virtio_device *vdev) -{ - struct virtio_pci_device *vp_dev = to_vp_device(vdev); - int i; - - if (vp_dev->intx_enabled) - synchronize_irq(vp_dev->pci_dev->irq); - - for (i = 0; i < vp_dev->msix_vectors; ++i) - synchronize_irq(vp_dev->msix_entries[i].vector); -} - static void vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev);@@ -213,6 +225,8 @@ static irqreturn_t vp_config_changed(int irq, void *opaque) drv = container_of(vp_dev->vdev.dev.driver, struct virtio_driver, driver); + atomic_inc(&vp_dev->config_changes); + if (drv && drv->config_changed) drv->config_changed(&vp_dev->vdev); return IRQ_HANDLED;@@ -646,6 +660,7 @@ static int __devinit virtio_pci_probe(struct pci_dev *pci_dev, vp_dev->vdev.config = &virtio_pci_config_ops; vp_dev->pci_dev = pci_dev; INIT_LIST_HEAD(&vp_dev->virtqueues); + atomic_set(&vp_dev->config_changes, 0); spin_lock_init(&vp_dev->lock); /* Disable MSI/MSIX to bring device to a known good state. */