Thread (6 messages), 2 authors, 2022-06-27

Re: [PATCH] virtio-net: fix race between ndo_open() and virtio_device_ready()

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2022-06-17 12:33:23
Also in: lkml, virtualization

On Fri, Jun 17, 2022 at 07:46:23PM +0800, Jason Wang wrote:
On Fri, Jun 17, 2022 at 6:13 PM Michael S. Tsirkin [off-list ref] wrote:
On Fri, Jun 17, 2022 at 03:29:49PM +0800, Jason Wang wrote:
We used to call virtio_device_ready() after netdev registration. This
causes a race between ndo_open() and virtio_device_ready(): if
ndo_open() is called before virtio_device_ready(), the driver may
start to use the device before DRIVER_OK is set, which violates the spec.

Fix this by switching to register_netdevice() and protecting
virtio_device_ready() with rtnl_lock(), so that ndo_open() can
only be called after virtio_device_ready().

Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index db05b5e930be..8a5810bcb839 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3655,14 +3655,20 @@ static int virtnet_probe(struct virtio_device *vdev)
      if (vi->has_rss || vi->has_rss_hash_report)
              virtnet_init_default_rss(vi);

-     err = register_netdev(dev);
+     /* serialize netdev register + virtio_device_ready() with ndo_open() */
+     rtnl_lock();
+
+     err = register_netdevice(dev);
      if (err) {
              pr_debug("virtio_net: registering device failed\n");
+             rtnl_unlock();
              goto free_failover;
      }

      virtio_device_ready(vdev);

+     rtnl_unlock();
+
      err = virtnet_cpu_notif_add(vi);
      if (err) {
              pr_debug("virtio_net: registering cpu notifier failed\n");
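
The ordering argument behind the hunk above can be modelled in user space. This is only a sketch under stated assumptions: a pthread mutex stands in for the RTNL lock, and `model_dev`, `model_probe()`, `model_open()` and `model_demo()` are hypothetical names that do not exist in the kernel. It relies on the fact that ndo_open() is invoked with RTNL held, so registering the netdev and setting DRIVER_OK inside one critical section leaves no window in which an open can see a registered but not-yet-ready device.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RTNL lock; not a kernel API. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct model_dev {
	bool registered;   /* netdev is visible, so an open may be attempted */
	bool driver_ok;    /* models the DRIVER_OK status bit */
	bool opened_early; /* set if an open ever ran before DRIVER_OK */
};

/* Models the patched probe path: register and mark ready under one lock. */
static void model_probe(struct model_dev *d)
{
	assert(d != NULL);
	pthread_mutex_lock(&model_rtnl);
	d->registered = true;  /* register_netdevice() step */
	d->driver_ok = true;   /* virtio_device_ready() step */
	pthread_mutex_unlock(&model_rtnl);
}

/* Models ndo_open(), which runs with the lock held. Returns true if opened. */
static bool model_open(struct model_dev *d)
{
	bool opened = false;

	assert(d != NULL);
	pthread_mutex_lock(&model_rtnl);
	if (d->registered) {
		if (!d->driver_ok)
			d->opened_early = true; /* the race the patch closes */
		opened = true;
	}
	pthread_mutex_unlock(&model_rtnl);
	return opened;
}

/* Returns 0 if an open can only succeed after the device is ready. */
static int model_demo(void)
{
	struct model_dev d = { false, false, false };

	if (model_open(&d))       /* not registered yet: open must fail */
		return 1;
	model_probe(&d);
	if (!model_open(&d))      /* registered and ready: open succeeds */
		return 2;
	return d.opened_early ? 3 : 0;
}
```

Because both fields are only ever changed together inside the critical section, an open that takes the same lock observes either neither or both of them set.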

Looks good but then don't we have the same issue when removing the
device?

Actually I looked at virtnet_remove() and I see
        unregister_netdev(vi->dev);

        net_failover_destroy(vi->failover);

        remove_vq_common(vi); <- this will reset the device

a window here?
Probably. For safety, we probably need to reset before unregistering.

Careful not to create new races. Let's analyse this one to be
sure first.

Really, I think what we had originally was a better idea -
instead of dropping interrupts they were delayed and
when driver is ready to accept them it just enables them.
The problem is that it works only on some specific setup:

- doesn't work on shared IRQs
- doesn't work on some specific drivers, e.g. virtio-blk
can some core irq work fix that?
We just need to make sure driver does not wait for
interrupts before enabling them.

And I suspect we need to make this opt-in on a per driver
basis.
Exactly.

Thanks
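
The "delay instead of drop" idea discussed above, combined with a per-driver opt-in, can be illustrated with a small user-space model. This is a sketch under assumptions, not the virtio core's behaviour: `model_irq`, its `opt_in_delay` flag, and both functions are hypothetical names. An interrupt raised while delivery is disabled is remembered as a single pending bit for drivers that opted in and dropped for drivers that did not; enabling delivery replays the pending one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line owned by one driver. */
struct model_irq {
	bool opt_in_delay; /* driver asked for delayed delivery (assumption) */
	bool enabled;      /* driver is ready to take interrupts */
	bool pending;      /* an interrupt arrived while disabled */
	unsigned handled;  /* interrupts delivered to the handler */
	unsigned dropped;  /* interrupts discarded */
};

/* The device raises an interrupt. */
static void model_irq_raise(struct model_irq *q)
{
	assert(q != NULL);
	if (q->enabled)
		q->handled++;
	else if (q->opt_in_delay)
		q->pending = true; /* coalesced: only one bit is kept */
	else
		q->dropped++;
}

/* The driver signals it is ready; a delayed interrupt is replayed once. */
static void model_irq_enable(struct model_irq *q)
{
	assert(q != NULL);
	q->enabled = true;
	if (q->pending) {
		q->pending = false;
		q->handled++;
	}
}
```

In this model a driver must not block waiting for an interrupt before calling `model_irq_enable()`, which mirrors the constraint stated in the thread.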
--
2.25.1
  