Thread (14 messages) 14 messages, 4 authors, 2022-01-31

Re: [PATCH net 1/2] vxlan: do not modify the shared tunnel info when PMTU triggers an ICMP reply

From: Antoine Tenart <atenart@kernel.org>
Date: 2022-01-28 17:01:42
Subsystem: networking [general], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

Hi Vlad,

Quoting Vlad Buslov (2022-01-20 13:58:18)
On Thu 20 Jan 2022 at 12:27, Antoine Tenart [off-list ref] wrote:
quoted
Quoting Vlad Buslov (2022-01-20 08:38:05)
quoted
We have been getting memleaks in one of our tests that point to this
code (test deletes vxlan device while running traffic redirected by OvS
TC at the same time):

unreferenced object 0xffff8882d0114200 (size 256):
    [<0000000097659d47>] metadata_dst_alloc+0x1f/0x470
    [<000000007571c30f>] tun_dst_unclone+0xee/0x360 [vxlan]
    [<00000000d2dcfd00>] vxlan_xmit_one+0x131d/0x2a00 [vxlan]
[...]
quoted
quoted
Looking at the code the potential issue seems to be that
tun_dst_unclone() creates new metadata_dst instance with refcount==1,
increments the refcount with dst_hold() to value 2, then returns it.
This seems to imply that caller is expected to release one of the
references (second one if for skb), but none of the callers (including
original dev_fill_metadata_dst()) do that, so I guess I'm
misunderstanding something here.

Any tips or suggestions?
I'd say there is no need to increase the dst refcount here after calling
metadata_dst_alloc, as the metadata is local to the skb and the dst
refcount was already initialized to 1. This might be an issue with
commit fc4099f17240 ("openvswitch: Fix egress tunnel info."); I CCed
Pravin, he might recall if there was a reason to increase the refcount.
I tried to remove the dst_hold(), but that caused underflows[0], so I
guess the current reference counting is required at least for some
use-cases.

[0]:

[  118.803011] dst_release: dst:000000001fc13e61 refcnt:-2                             
[...]

I finally had some time to look at this. Does the diff below fix your
issue?
diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index 14efa0ded75d..90a7a4daea9c 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
 static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
 {
        struct metadata_dst *md_dst = skb_metadata_dst(skb);
-       int md_size;
        struct metadata_dst *new_md;
+       int md_size, ret;
 
        if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
                return ERR_PTR(-EINVAL);
@@ -123,8 +123,15 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
 
        memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
               sizeof(struct ip_tunnel_info) + md_size);
+#ifdef CONFIG_DST_CACHE
+       ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
+       if (ret) {
+               metadata_dst_free(new_md);
+               return ERR_PTR(ret);
+       }
+#endif
+
        skb_dst_drop(skb);
-       dst_hold(&new_md->dst);
        skb_dst_set(skb, &new_md->dst);
        return new_md;
 }
Antoine
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help