Re: [PATCH net-next v7 3/9] tun/tap: add ptr_ring consume helper with netdev queue wakeup
From: Jason Wang <jasowang@redhat.com>
Date: 2026-01-23 03:05:20
Also in:
kvm, lkml, virtualization
On Thu, Jan 22, 2026 at 1:35 PM Jason Wang [off-list ref] wrote:
On Wed, Jan 21, 2026 at 5:33 PM Simon Schippers [off-list ref] wrote:quoted
On 1/9/26 07:02, Jason Wang wrote:quoted
On Thu, Jan 8, 2026 at 3:41 PM Simon Schippers [off-list ref] wrote:quoted
On 1/8/26 04:38, Jason Wang wrote:quoted
On Thu, Jan 8, 2026 at 5:06 AM Simon Schippers [off-list ref] wrote:quoted
Introduce {tun,tap}_ring_consume() helpers that wrap __ptr_ring_consume() and wake the corresponding netdev subqueue when consuming an entry frees space in the underlying ptr_ring. Stopping of the netdev queue when the ptr_ring is full will be introduced in an upcoming commit. Co-developed-by: Tim Gebauer <redacted> Signed-off-by: Tim Gebauer <redacted> Signed-off-by: Simon Schippers <redacted> --- drivers/net/tap.c | 23 ++++++++++++++++++++++- drivers/net/tun.c | 25 +++++++++++++++++++++++-- 2 files changed, 45 insertions(+), 3 deletions(-)diff --git a/drivers/net/tap.c b/drivers/net/tap.c index 1197f245e873..2442cf7ac385 100644 --- a/drivers/net/tap.c +++ b/drivers/net/tap.c@@ -753,6 +753,27 @@ static ssize_t tap_put_user(struct tap_queue *q, return ret ? ret : total; } +static void *tap_ring_consume(struct tap_queue *q) +{ + struct ptr_ring *ring = &q->ring; + struct net_device *dev; + void *ptr; + + spin_lock(&ring->consumer_lock); + + ptr = __ptr_ring_consume(ring); + if (unlikely(ptr && __ptr_ring_consume_created_space(ring, 1))) { + rcu_read_lock(); + dev = rcu_dereference(q->tap)->dev; + netif_wake_subqueue(dev, q->queue_index); + rcu_read_unlock(); + } + + spin_unlock(&ring->consumer_lock); + + return ptr; +} + static ssize_t tap_do_read(struct tap_queue *q, struct iov_iter *to, int noblock, struct sk_buff *skb)@@ -774,7 +795,7 @@ static ssize_t tap_do_read(struct tap_queue *q, TASK_INTERRUPTIBLE); /* Read frames from the queue */ - skb = ptr_ring_consume(&q->ring); + skb = tap_ring_consume(q); if (skb) break; if (noblock) {diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 8192740357a0..7148f9a844a4 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c@@ -2113,13 +2113,34 @@ static ssize_t tun_put_user(struct tun_struct *tun, return total; } +static void *tun_ring_consume(struct tun_file *tfile) +{ + struct ptr_ring *ring = &tfile->tx_ring; + struct net_device *dev; + void *ptr; + + spin_lock(&ring->consumer_lock); + + ptr = __ptr_ring_consume(ring); + if (unlikely(ptr && __ptr_ring_consume_created_space(ring, 1))) {I guess it's the "bug" I mentioned in the previous patch that leads to the check of __ptr_ring_consume_created_space() here. If it's true, another call to tweak the current API.quoted
+ rcu_read_lock(); + dev = rcu_dereference(tfile->tun)->dev; + netif_wake_subqueue(dev, tfile->queue_index);This would cause the producer TX_SOFTIRQ to run on the same cpu which I'm not sure is what we want.What else would you suggest calling to wake the queue?I don't have a good method in my mind, just want to point out its implications.I have to admit I'm a bit stuck at this point, particularly with this aspect. What is the correct way to pass the producer CPU ID to the consumer? Would it make sense to store smp_processor_id() in the tfile inside tun_net_xmit(), or should it instead be stored in the skb (similar to the XDP bit)? In the latter case, my concern is that this information may already be significantly outdated by the time it is used. Based on that, my idea would be for the consumer to wake the producer by invoking a new function (e.g., tun_wake_queue()) on the producer CPU via smp_call_function_single(). Is this a reasonable approach?I'm not sure but it would introduce costs like IPI.quoted
More generally, would triggering TX_SOFTIRQ on the consumer CPU be considered a deal-breaker for the patch set?It depends on whether or not it has effects on the performance. Especially when vhost is pinned.
I meant we can benchmark to see the impact. For example, pin vhost to a specific CPU and the try to see the impact of the TX_SOFTIRQ. Thanks