Thread: 6 messages, 2 authors, 2023-10-06

Re: macvtap performs IP defragmentation, causing MTU problems for virtual machines

From: Henrik Lindström <hidden>
Date: 2023-10-02 18:49:44
Also in: lkml
Subsystem: networking drivers, networking [general], networking [ipv4/ipv6], the rest · Maintainers: Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern, Ido Schimmel, Linus Torvalds

I had to change "return 0" to "return vif", but other than that your changes
seem to work, even with the macvlan defragmentation removed.

I tested it by sending 8K fragmented multicast packets, with 5 macvlans on
the receiving side. I consistently received 6 copies of the packet (1 from the
real interface and 1 per macvlan). While doing this I had my VM running with
a macvtap, and it was receiving fragmented packets as expected.
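
For a rough idea of how many fragments each such packet turns into, here is a
minimal sketch of the arithmetic. It assumes a 1500-byte MTU, a 20-byte IPv4
header without options, and that "8K" means 8192 bytes of UDP payload; none of
these are stated above, and ipv4_fragment_count() is a hypothetical helper
name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes: IPv4 header without options, plain UDP header. */
#define IPV4_HDR_LEN 20u
#define UDP_HDR_LEN   8u

/*
 * Number of IPv4 fragments needed to carry a UDP datagram with
 * udp_payload bytes of data over a link with the given MTU.
 * Every fragment except the last must carry a multiple of 8 bytes,
 * so the per-fragment capacity is rounded down to a multiple of 8.
 */
size_t ipv4_fragment_count(size_t udp_payload, size_t mtu)
{
	size_t l4_len = udp_payload + UDP_HDR_LEN;
	size_t per_frag;

	/* The MTU must leave room for the header plus one 8-byte unit. */
	assert(mtu >= IPV4_HDR_LEN + 8);
	per_frag = (mtu - IPV4_HDR_LEN) & ~(size_t)7;

	/* Round up: a partially filled last fragment still counts. */
	return (l4_len + per_frag - 1) / per_frag;
}
```

Under those assumptions, ipv4_fragment_count(8192, 1500) gives 6 fragments per
datagram, each of which has to be matched to a reassembly queue on the
receiving side.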

Here are the changes I was testing with. This is my first time sending a diff
over mail, so I hope it works :-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 02bd201bc7e5..5f638433cef9 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -462,10 +462,6 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	if (is_multicast_ether_addr(eth->h_dest)) {
 		unsigned int hash;
 
-		skb = ip_check_defrag(dev_net(skb->dev), skb, IP_DEFRAG_MACVLAN);
-		if (!skb)
-			return RX_HANDLER_CONSUMED;
-		*pskb = skb;
 		eth = eth_hdr(skb);
 		if (macvlan_forward_source(skb, port, eth->h_source)) {
 			kfree_skb(skb);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index a4941f53b523..30b822dfa222 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -479,11 +479,29 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 	return err;
 }
 
+static int ip_defrag_vif(const struct sk_buff *skb, const struct net_device *dev)
+{
+	int vif = l3mdev_master_ifindex_rcu(dev);
+
+	if (vif)
+		return vif;
+
+	/* some folks insist that receiving a fragmented mcast dgram on n devices shall
+	 * result in n defragmented packets.
+	 */
+	if (skb->pkt_type == PACKET_BROADCAST || skb->pkt_type == PACKET_MULTICAST) {
+		if (dev)
+			vif = dev->ifindex;
+	}
+
+	return vif;
+}
+
 /* Process an incoming IP datagram fragment. */
 int ip_defrag(struct net *net, struct sk_buff *skb, u32 user)
 {
 	struct net_device *dev = skb->dev ? : skb_dst(skb)->dev;
-	int vif = l3mdev_master_ifindex_rcu(dev);
+	int vif = ip_defrag_vif(skb, dev);
 	struct ipq *qp;
 
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMREQDS);
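
To make the effect of ip_defrag_vif() in the diff above easier to see, here is
a small userspace model of the same decision order. It is only a sketch: the
model_* types and names are hypothetical stand-ins rather than kernel
definitions, and the l3mdev_master field stands in for the result of
l3mdev_master_ifindex_rcu() (assumed to be 0 when there is no L3 master).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
enum model_pkt_type { MODEL_HOST, MODEL_BROADCAST, MODEL_MULTICAST };

struct model_dev {
	int ifindex;       /* index of the receiving device */
	int l3mdev_master; /* index of the L3 master device, 0 if none */
};

/*
 * Same decision order as ip_defrag_vif() in the diff:
 * 1. use the L3 master index if there is one;
 * 2. otherwise, for broadcast/multicast, use the receiving device's
 *    own index so each device gets its own reassembly key;
 * 3. otherwise return 0.
 */
int model_defrag_vif(enum model_pkt_type type, const struct model_dev *dev)
{
	int vif = dev ? dev->l3mdev_master : 0;

	if (vif)
		return vif;

	if ((type == MODEL_BROADCAST || type == MODEL_MULTICAST) && dev)
		vif = dev->ifindex;

	return vif;
}
```

In this model, two macvlans with ifindex 3 and 4 and no L3 master get vif 3
and 4 for a multicast fragment, so their fragments would be reassembled
separately, while unicast fragments on either device still get vif 0.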

