Thread (4 messages), 2 authors, 2016-12-02

Re: [PATCH net] tcp: warn on bogus MSS and try to amend it

From: David Miller <davem@davemloft.net>
Date: 2016-12-01 20:29:51

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date: Wed, 30 Nov 2016 11:14:32 -0200
There have been some reports lately about TCP connection stalls caused
by NIC drivers that aren't setting gso_size on aggregated packets on rx
path. This causes TCP to assume that the MSS is actually the size of the
aggregated packet, which is invalid.

Although the proper fix belongs in each driver, it's often hard and
cumbersome to debug the stall, trace it to this root cause and
report/fix it.

This patch amends this situation in two ways. First, it adds a warning
for when this situation occurs, so it gives a hint to those trying to
debug this. Second, it limits the maximum probed MSS to the advertised
MSS, as it should never be any higher than that.

The result is that the connection may not have the best performance ever
but it shouldn't stall, and the admin will have a hint on what to look
for.
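
As a rough illustration of the clamp described above, here is a minimal
userspace sketch in C. It is not the kernel code: struct rx_state and
measure_rcv_mss() are hypothetical stand-ins for the icsk/tcp_sk fields
the patch touches, and only the branch changed by the patch is modelled.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields the patch touches. */
struct rx_state {
	unsigned int rcv_mss;	/* models icsk->icsk_ack.rcv_mss */
	unsigned int advmss;	/* models tcp_sk(sk)->advmss */
};

/*
 * Update the receive MSS estimate from one incoming segment.
 * gso_size == 0 models a driver that aggregated packets on rx
 * without recording the original segment size.
 * Returns true when the estimate had to be clamped to advmss,
 * which is the case where the patch would print its warning.
 */
static bool measure_rcv_mss(struct rx_state *st, unsigned int gso_size,
			    unsigned int skb_len)
{
	unsigned int len = gso_size ? gso_size : skb_len;

	/* Smaller segments never raise the estimate in this sketch. */
	if (len < st->rcv_mss)
		return false;

	/* The estimate may never exceed what we advertised. */
	st->rcv_mss = len < st->advmss ? len : st->advmss;
	return st->rcv_mss != len;
}
```

With advmss at 1460, an aggregated 8760-byte packet with gso_size 0
leaves rcv_mss at 1460 and reports a clamp, whereas a packet carrying
gso_size 1448 sets rcv_mss to 1448 and reports nothing.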

Tested with virtio by forcing gso_size to 0.

Cc: Jonathan Maxwell <redacted>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
I totally agree with this change, however I think the warning message can
be improved in two ways:
 	len = skb_shinfo(skb)->gso_size ? : skb->len;
 	if (len >= icsk->icsk_ack.rcv_mss) {
-		icsk->icsk_ack.rcv_mss = len;
+		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
+					       tcp_sk(sk)->advmss);
+		if (icsk->icsk_ack.rcv_mss != len)
+			pr_warn_once("Seems your NIC driver is doing bad RX acceleration. TCP performance may be compromised.\n");
We know it's a bad GRO implementation that causes this so let's be specific in the
message, perhaps something like:

	Driver has suspect GRO implementation, TCP performance may be compromised.

Also, we have skb->dev available here most likely, so prefixing the message with
skb->dev->name would make analyzing this situation even easier for someone hitting
this.

I'm not certain if an skb->dev==NULL check is necessary here or not, but it is
definitely something you need to consider.

Thanks!
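
For illustration only, the two suggestions in the reply (a GRO-specific
message, prefixed with the device name and guarded against a missing
device) might look like the userspace sketch below. struct fake_dev and
format_gro_warning() are hypothetical names, and the "Unknown" fallback
string is an assumption, not something stated in the thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the part of struct net_device used here. */
struct fake_dev {
	const char *name;
};

/*
 * Build the suggested warning text, prefixed with the device name.
 * A NULL dev models the skb->dev == NULL case raised in the reply;
 * the "Unknown" placeholder is an assumption made for this sketch.
 * Returns the snprintf() result (length the full message needs).
 */
static int format_gro_warning(char *buf, size_t size,
			      const struct fake_dev *dev)
{
	return snprintf(buf, size,
			"%s: Driver has suspect GRO implementation, TCP performance may be compromised.",
			dev ? dev->name : "Unknown");
}
```

In the kernel the same string would presumably go through
pr_warn_once() rather than snprintf(), so that it is printed only once.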