RE: [PATCH net v2] tun: fix ubuf refcount incorrectly on error path

From: wangyunjian <hidden>
Date: 2020-12-14 06:57:40

-----Original Message-----
From: Jason Wang [mailto:jasowang@redhat.com]
Sent: Monday, December 14, 2020 12:07 PM
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: wangyunjian <redacted>; Michael S. Tsirkin
[off-list ref]; virtualization@lists.linux-foundation.org; Network
Development [off-list ref]; Lilijun (Jerry)
[off-list ref]; chenchanghu [off-list ref];
xudingke [off-list ref]; huangbin (J)
[off-list ref]; Willem de Bruijn [off-list ref]
Subject: Re: [PATCH net v2] tun: fix ubuf refcount incorrectly on error path


On 2020/12/14 上午11:56, Willem de Bruijn wrote:

quoted

On Sun, Dec 13, 2020 at 10:54 PM Willem de Bruijn
[off-list ref] wrote:

quoted

On Sun, Dec 13, 2020 at 10:30 PM Jason Wang [off-list ref]

wrote:

quoted

On 2020/12/14 上午9:32, Willem de Bruijn wrote:

quoted

On Sat, Dec 12, 2020 at 7:18 PM Willem de Bruijn
[off-list ref] wrote:

quoted

afterwards, the error handling in vhost handle_tx() will try to
decrease the same refcount again. This is wrong and fix this by
delay copying ubuf_info until we're sure there's no errors.

I think the right approach is to address this in the error
paths, rather than complicate the normal datapath.

Is it sufficient to suppress the call to vhost_net_ubuf_put in
the handle_tx sendmsg error path, given that
vhost_zerocopy_callback will be called on kfree_skb?

We can not call kfree_skb() until the skb was created.

quoted

Or alternatively clear the destructor in drop:

The uarg->callback() is called immediately after we decide do
datacopy even if caller want to do zerocopy. If another error
occurs later, the vhost
handle_tx() will try to decrease it again.

Oh right, I missed the else branch in this path:

          /* copy skb_ubuf_info for callback when skb has no error */
          if (zerocopy) {
                  skb_shinfo(skb)->destructor_arg = msg_control;
                  skb_shinfo(skb)->tx_flags |=

SKBTX_DEV_ZEROCOPY;

quoted

                  skb_shinfo(skb)->tx_flags |=

SKBTX_SHARED_FRAG;

quoted

          } else if (msg_control) {
                  struct ubuf_info *uarg = msg_control;
                  uarg->callback(uarg, false);
          }

So if handle_tx_zerocopy calls tun_sendmsg with ubuf_info (and
thus a reference to release), there are these five options:

1. tun_sendmsg succeeds, ubuf_info is associated with skb.
       reference released from kfree_skb calling
vhost_zerocopy_callback later

2. tun_sendmsg succeeds, ubuf_info is released immediately, as skb
is not zerocopy.

3. tun_sendmsg fails before creating skb, handle_tx_zerocopy
correctly cleans up on receiving error from tun_sendmsg.

4. tun_sendmsg fails after creating skb, but with copying:
decremented at branch shown above + again in handle_tx_zerocopy

5. tun_sendmsg fails after creating skb, with zerocopy:
decremented at kfree_skb in drop: + again in handle_tx_zerocopy

Since handle_tx_zerocopy has no idea whether on error 3, 4 or 5
occurred,

Actually, it does. If sendmsg returns an error, it can test whether
vq->heads[nvq->upend_idx].len != VHOST_DMA_IN_PROGRESS.

Just to make sure I understand this. Any reason for it can't be
VHOST_DMA_IN_PROGRESS here?

It can be, and it will be if tun_sendmsg returns EINVAL before
assigning the skb destructor.

I meant returns an error, not necessarily only EINVAL.

quoted

Only if tun_sendmsg released the zerocopy state through
kfree_skb->vhost_zerocopy_callback will it have been updated to
VHOST_DMA_DONE_LEN. And only then must the caller not try to release
the state again.


I see. So I tend to fix this in vhost instead of tun to be consistent with the
current error handling in handle_tx_zerocopy().

Agree, thanks for the suggestion. 
I'll send v3 patch according to your comments.

Thanks

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help