RE: [PATCH net v2] tun: fix ubuf refcount incorrectly on error path
From: wangyunjian <hidden>
Date: 2020-12-12 06:46:52
-----Original Message----- From: Willem de Bruijn [mailto:willemdebruijn.kernel@gmail.com] Sent: Wednesday, December 9, 2020 10:43 PM To: wangyunjian <redacted> Cc: Michael S. Tsirkin <mst@redhat.com>; Jason Wang [off-list ref]; virtualization@lists.linux-foundation.org; Network Development [off-list ref]; Lilijun (Jerry) [off-list ref]; chenchanghu [off-list ref]; xudingke [off-list ref] Subject: Re: [PATCH net v2] tun: fix ubuf refcount incorrectly on error path On Wed, Dec 9, 2020 at 8:03 AM wangyunjian [off-list ref] wrote:quoted
From: Yunjian Wang <redacted> After setting callback for ubuf_info of skb, the callback (vhost_net_zerocopy_callback) will be called to decrease the refcount when freeing skb. But when an exception occursWith exception, you mean if tun_get_user returns an error that propagates to the sendmsg call in vhost handle_tx, correct?
Yes
quoted
afterwards, the error handling in vhost handle_tx() will try to decrease the same refcount again. This is wrong and fix this by delay copying ubuf_info until we're sure there's no errors.I think the right approach is to address this in the error paths, rather than complicate the normal datapath. Is it sufficient to suppress the call to vhost_net_ubuf_put in the handle_tx sendmsg error path, given that vhost_zerocopy_callback will be called on kfree_skb?
We can not call kfree_skb() until the skb was created.
Or alternatively clear the destructor in drop:
The uarg->callback() is called immediately after we decide do datacopy even if caller want to do zerocopy. If another error occurs later, the vhost handle_tx() will try to decrease it again. Thanks
quoted
Fixes: 4477138fa0ae ("tun: properly test for IFF_UP") Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Yunjian Wang <redacted> --- v2: Updated code, fix by delay copying ubuf_info --- drivers/net/tun.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)diff --git a/drivers/net/tun.c b/drivers/net/tun.c index2dc1988a8973..2ea822328e73 100644--- a/drivers/net/tun.c +++ b/drivers/net/tun.c@@ -1637,6 +1637,20 @@ static struct sk_buff *tun_build_skb(structtun_struct *tun,quoted
return NULL; } +/* copy ubuf_info for callback when skb has no error */ static inline +void tun_copy_ubuf_info(struct sk_buff *skb, bool zerocopy, void +*msg_control) { + if (zerocopy) { + skb_shinfo(skb)->destructor_arg = msg_control; + skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY; + skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG; + } else if (msg_control) { + struct ubuf_info *uarg = msg_control; + + uarg->callback(uarg, false); + } +} + /* Get packet from user space buffer */ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, void *msg_control, struct iov_iter *from,@@ -1812,16 +1826,6 @@ static ssize_t tun_get_user(struct tun_struct*tun, struct tun_file *tfile,quoted
break; } - /* copy skb_ubuf_info for callback when skb has no error */ - if (zerocopy) { - skb_shinfo(skb)->destructor_arg = msg_control; - skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY; - skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG; - } else if (msg_control) { - struct ubuf_info *uarg = msg_control; - uarg->callback(uarg, false); - } - skb_reset_network_header(skb); skb_probe_transport_header(skb); skb_record_rx_queue(skb, tfile->queue_index); @@ -1830,6 +1834,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, structtun_file *tfile,quoted
struct bpf_prog *xdp_prog; int ret; + tun_copy_ubuf_info(skb, zerocopy, msg_control); local_bh_disable(); rcu_read_lock(); xdp_prog = rcu_dereference(tun->xdp_prog); @@-1881,6quoted
+1886,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, structtun_file *tfile,quoted
return -ENOMEM; } + tun_copy_ubuf_info(skb, zerocopy, msg_control); local_bh_disable(); napi_gro_frags(&tfile->napi); local_bh_enable();@@ -1889,6 +1895,7 @@ static ssize_t tun_get_user(struct tun_struct *tun,struct tun_file *tfile,quoted
struct sk_buff_head *queue =&tfile->sk.sk_write_queue;quoted
int queue_len; + tun_copy_ubuf_info(skb, zerocopy, msg_control); spin_lock_bh(&queue->lock); __skb_queue_tail(queue, skb); queue_len = skb_queue_len(queue); @@ -1899,8+1906,10quoted
@@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file*tfile, local_bh_enable(); } else if (!IS_ENABLED(CONFIG_4KSTACKS)) { + tun_copy_ubuf_info(skb, zerocopy, msg_control); tun_rx_batched(tun, tfile, skb, more); } else { + tun_copy_ubuf_info(skb, zerocopy, msg_control); netif_rx_ni(skb); } rcu_read_unlock(); -- 2.23.0