Re: [PATCH net-next] ibmveth: v1 calculate correct gso_size and set gso_type
From: Brian King <hidden>
Date: 2016-11-11 18:17:32
Also in:
lkml, netdev
On 10/27/2016 10:26 AM, Eric Dumazet wrote:
> On Wed, 2016-10-26 at 11:09 +1100, Jon Maxwell wrote:
> > We recently encountered a bug where a few customers using ibmveth on the
> > same LPAR hit an issue where a TCP session hung when large receive was
> > enabled. Closer analysis revealed that the session was stuck because the
> > one side was advertising a zero window repeatedly.
> >
> > We narrowed this down to the fact the ibmveth driver did not set gso_size
> > which is translated by TCP into the MSS later up the stack. The MSS is
> > used to calculate the TCP window size and as that was abnormally large,
> > it was calculating a zero window, even although the sockets receive buffer
> > was completely empty.
> >
> > We were able to reproduce this and worked with IBM to fix this. Thanks Tom
> > and Marcelo for all your help and review on this.
> >
> > The patch fixes both our internal reproduction tests and our customers tests.
> >
> > Signed-off-by: Jon Maxwell <redacted>
> > ---
> >  drivers/net/ethernet/ibm/ibmveth.c | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
> > index 29c05d0..c51717e 100644
> > --- a/drivers/net/ethernet/ibm/ibmveth.c
> > +++ b/drivers/net/ethernet/ibm/ibmveth.c
> > @@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
> >  	int frames_processed = 0;
> >  	unsigned long lpar_rc;
> >  	struct iphdr *iph;
> > +	bool large_packet = 0;
> > +	u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);
> >
> >  restart_poll:
> >  	while (frames_processed < budget) {
> > @@ -1236,10 +1238,28 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
> >  						iph->check = 0;
> >  						iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
> >  						adapter->rx_large_packets++;
> > +						large_packet = 1;
> >  					}
> >  				}
> >  			}
> >
> > +			if (skb->len > netdev->mtu) {
> > +				iph = (struct iphdr *)skb->data;
> > +				if (be16_to_cpu(skb->protocol) == ETH_P_IP &&
> > +				    iph->protocol == IPPROTO_TCP) {
> > +					hdr_len += sizeof(struct iphdr);
> > +					skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> > +					skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> > +				} else if (be16_to_cpu(skb->protocol) == ETH_P_IPV6 &&
> > +					   iph->protocol == IPPROTO_TCP) {
> > +					hdr_len += sizeof(struct ipv6hdr);
> > +					skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> > +					skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> > +				}
> > +				if (!large_packet)
> > +					adapter->rx_large_packets++;
> > +			}
> > +
>
> This might break forwarding and PMTU discovery.
>
> You force gso_size to device mtu, regardless of real MSS used by the TCP
> sender.
>
> Don't you have the MSS provided in RX descriptor, instead of guessing
> the value?
Eric,

We are currently pursuing making changes to the Power Virtual I/O Server to
provide the MSS to the ibmveth driver. However, this will take time to go
through test and ultimately get released. Although imperfect, this patch does
help a real customer hitting this issue right now. Would you object to this
patch getting merged as is, with the understanding that when we get the change
in the Virtual I/O Server released, we will revert this interim change and
apply the new method?

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center
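
For reference, the arithmetic behind the "guessing" concern can be sketched in user-space C. This is only an illustration, not driver code: the `sketch_` names are hypothetical, and the header sizes assume no IP options, IPv6 extension headers, or TCP options, matching the `sizeof()` values the quoted patch uses. The first function mirrors the patch's `netdev->mtu - hdr_len` derivation; the second gives the largest MSS that fits the same MTU, since the MTU does not include the Ethernet header. Neither value is the MSS the TCP sender actually used, which may be smaller.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed header sizes, assuming no IP/TCP options (hypothetical names). */
enum {
	SKETCH_ETH_HLEN  = 14,	/* ETH_HLEN */
	SKETCH_IPV4_HLEN = 20,	/* sizeof(struct iphdr) */
	SKETCH_IPV6_HLEN = 40,	/* sizeof(struct ipv6hdr) */
	SKETCH_TCP_HLEN  = 20,	/* sizeof(struct tcphdr) */
};

/* gso_size as the quoted patch derives it: mtu - (ETH + IP + TCP headers). */
uint16_t sketch_patch_gso_size(uint16_t mtu, int is_ipv6)
{
	uint16_t hdr_len = SKETCH_ETH_HLEN + SKETCH_TCP_HLEN +
			   (is_ipv6 ? SKETCH_IPV6_HLEN : SKETCH_IPV4_HLEN);

	assert(mtu > hdr_len);
	return (uint16_t)(mtu - hdr_len);
}

/* Largest MSS that fits the MTU: only the IP and TCP headers are subtracted. */
uint16_t sketch_mtu_mss(uint16_t mtu, int is_ipv6)
{
	uint16_t hdr_len = SKETCH_TCP_HLEN +
			   (is_ipv6 ? SKETCH_IPV6_HLEN : SKETCH_IPV4_HLEN);

	assert(mtu > hdr_len);
	return (uint16_t)(mtu - hdr_len);
}
```

With a 1500-byte MTU, `sketch_patch_gso_size(1500, 0)` gives 1446 while `sketch_mtu_mss(1500, 0)` gives 1460. Both depend only on the receiving device's MTU, so a sender that negotiated a smaller MSS (for example because of a lower path MTU) would still be misrepresented.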