Re: [PATCH 6/6] vhost_net: use RCU callbacks instead of synchronize_rcu()
From: Jason Wang <jasowang@redhat.com>
Date: 2021-11-16 05:01:46
Also in:
bpf, kvm, lkml, virtualization
On Mon, Nov 15, 2021 at 11:32 PM Andrey Ryabinin [off-list ref] wrote:
Currently vhost_net_release() uses synchronize_rcu() to synchronize
freeing with vhost_zerocopy_callback(). However synchronize_rcu()
is quite costly operation. It take more than 10 seconds
to shutdown qemu launched with couple net devices like this:
-netdev tap,id=tap0,..,vhost=on,queues=80
because we end up calling synchronize_rcu() netdev_count*queues times.
Free vhost net structures in rcu callback instead of using
synchronize_rcu() to fix the problem.
I admit the release code is somehow hard to understand. But I wonder
if the following case can still happen with this:
CPU 0 (vhost_dev_cleanup) CPU1
(vhost_net_zerocopy_callback()->vhost_work_queue())
if (!dev->worker)
dev->worker = NULL
wake_up_process(dev->worker)
If this is true. It seems the fix is to move RCU synchronization stuff
in vhost_net_ubuf_put_and_wait()?
Thanks
quoted hunk ↗ jump to hunk
Signed-off-by: Andrey Ryabinin <redacted> --- drivers/vhost/net.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 97a209d6a527..0699d30e83d5 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c@@ -132,6 +132,7 @@ struct vhost_net { struct vhost_dev dev; struct vhost_net_virtqueue vqs[VHOST_NET_VQ_MAX]; struct vhost_poll poll[VHOST_NET_VQ_MAX]; + struct rcu_head rcu; /* Number of TX recently submitted. * Protected by tx vq lock. */ unsigned tx_packets;@@ -1389,6 +1390,18 @@ static void vhost_net_flush(struct vhost_net *n) } } +static void vhost_net_free(struct rcu_head *rcu_head) +{ + struct vhost_net *n = container_of(rcu_head, struct vhost_net, rcu); + + kfree(n->vqs[VHOST_NET_VQ_RX].rxq.queue); + kfree(n->vqs[VHOST_NET_VQ_TX].xdp); + kfree(n->dev.vqs); + if (n->page_frag.page) + __page_frag_cache_drain(n->page_frag.page, n->refcnt_bias); + kvfree(n); +} + static int vhost_net_release(struct inode *inode, struct file *f) { struct vhost_net *n = f->private_data;@@ -1404,15 +1417,8 @@ static int vhost_net_release(struct inode *inode, struct file *f) sockfd_put(tx_sock); if (rx_sock) sockfd_put(rx_sock); - /* Make sure no callbacks are outstanding */ - synchronize_rcu(); - kfree(n->vqs[VHOST_NET_VQ_RX].rxq.queue); - kfree(n->vqs[VHOST_NET_VQ_TX].xdp); - kfree(n->dev.vqs); - if (n->page_frag.page) - __page_frag_cache_drain(n->page_frag.page, n->refcnt_bias); - kvfree(n); + call_rcu(&n->rcu, vhost_net_free); return 0; } --2.32.0