Re: [PATCH net 1/2] vxlan: do not modify the shared tunnel info when PMTU triggers an ICMP reply
From: Antoine Tenart <atenart@kernel.org>
Date: 2022-01-28 17:01:42
Subsystem:
networking [general], the rest · Maintainers:
"David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds
Hi Vlad, Quoting Vlad Buslov (2022-01-20 13:58:18)
On Thu 20 Jan 2022 at 12:27, Antoine Tenart [off-list ref] wrote:quoted
Quoting Vlad Buslov (2022-01-20 08:38:05)quoted
We have been getting memleaks in one of our tests that point to this code (test deletes vxlan device while running traffic redirected by OvS TC at the same time): unreferenced object 0xffff8882d0114200 (size 256): [<0000000097659d47>] metadata_dst_alloc+0x1f/0x470 [<000000007571c30f>] tun_dst_unclone+0xee/0x360 [vxlan] [<00000000d2dcfd00>] vxlan_xmit_one+0x131d/0x2a00 [vxlan]
[...]
quoted
quoted
Looking at the code the potential issue seems to be that tun_dst_unclone() creates new metadata_dst instance with refcount==1, increments the refcount with dst_hold() to value 2, then returns it. This seems to imply that caller is expected to release one of the references (second one if for skb), but none of the callers (including original dev_fill_metadata_dst()) do that, so I guess I'm misunderstanding something here. Any tips or suggestions?I'd say there is no need to increase the dst refcount here after calling metadata_dst_alloc, as the metadata is local to the skb and the dst refcount was already initialized to 1. This might be an issue with commit fc4099f17240 ("openvswitch: Fix egress tunnel info."); I CCed Pravin, he might recall if there was a reason to increase the refcount.I tried to remove the dst_hold(), but that caused underflows[0], so I guess the current reference counting is required at least for some use-cases. [0]: [ 118.803011] dst_release: dst:000000001fc13e61 refcnt:-2
[...] I finally had some time to look at this. Does the diff below fix your issue?
diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index 14efa0ded75d..90a7a4daea9c 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h@@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size) static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb) { struct metadata_dst *md_dst = skb_metadata_dst(skb); - int md_size; struct metadata_dst *new_md; + int md_size, ret; if (!md_dst || md_dst->type != METADATA_IP_TUNNEL) return ERR_PTR(-EINVAL);
@@ -123,8 +123,15 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb) memcpy(&new_md->u.tun_info, &md_dst->u.tun_info, sizeof(struct ip_tunnel_info) + md_size); +#ifdef CONFIG_DST_CACHE + ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC); + if (ret) { + metadata_dst_free(new_md); + return ERR_PTR(ret); + } +#endif + skb_dst_drop(skb); - dst_hold(&new_md->dst); skb_dst_set(skb, &new_md->dst); return new_md; }
Antoine