Thread (9 messages) 9 messages, 3 authors, 2022-06-30

Re: [PATCH V2] virtio-net: fix the race between refill work and close

From: Jason Wang <jasowang@redhat.com>
Date: 2022-06-30 06:15:54
Also in: lkml, virtualization

On Thu, Jun 30, 2022 at 2:07 PM Jason Wang [off-list ref] wrote:
On Thu, Jun 30, 2022 at 10:51 AM Jakub Kicinski [off-list ref] wrote:
quoted
On Thu, 30 Jun 2022 10:08:04 +0800 Jason Wang wrote:
quoted
+static void enable_refill_work(struct virtnet_info *vi)
+{
+     spin_lock(&vi->refill_lock);
+     vi->refill_work_enabled = true;
+     spin_unlock(&vi->refill_lock);
+}
+
+static void disable_refill_work(struct virtnet_info *vi)
+{
+     spin_lock(&vi->refill_lock);
+     vi->refill_work_enabled = false;
+     spin_unlock(&vi->refill_lock);
+}
+
 static void virtqueue_napi_schedule(struct napi_struct *napi,
                                  struct virtqueue *vq)
 {
@@ -1527,8 +1547,12 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
      }

      if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
-             if (!try_fill_recv(vi, rq, GFP_ATOMIC))
-                     schedule_delayed_work(&vi->refill, 0);
+             if (!try_fill_recv(vi, rq, GFP_ATOMIC)) {
+                     spin_lock(&vi->refill_lock);
+                     if (vi->refill_work_enabled)
+                             schedule_delayed_work(&vi->refill, 0);
+                     spin_unlock(&vi->refill_lock);
Are you sure you can use the basic spin_lock() flavor in all cases?
Isn't the disable/enable called from a different context than this
thing here?
This function will only be called in bh so it's safe.
Ok, so it looks like we should use the bh variant in close. Otherwise
we may have a deadlock. Will fix it.

Thanks
quoted
The entire delayed work construct seems a little risky because the work
may go to sleep after disabling napi, causing large latency spikes.
Yes, but it only happens on OOM.
quoted
I guess you must have a good reason no to simply reschedule the NAPI
and keep retrying with GFP_ATOMIC...
Less pressure on the memory allocator on OOM probably, but it looks
like an independent issue that might be optimized in the future.
quoted
Please add the target tree name to the subject.
Ok

Thanks
quoted
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help