Thread (8 messages) 8 messages, 2 authors, 2021-02-26

Re: Spurious TCP retransmissions on ack vs kfree_skb reordering

From: Eric Dumazet <edumazet@google.com>
Date: 2021-02-26 10:42:39

On Fri, Feb 26, 2021 at 11:05 AM Eric Dumazet [off-list ref] wrote:
On Fri, Feb 26, 2021 at 4:15 AM Jakub Kicinski [off-list ref] wrote:
quoted
On Thu, 25 Feb 2021 15:25:15 -0800 Jakub Kicinski wrote:
quoted
Hi!

We see large (4-8x) increase of what looks like TCP RTOs after rising
the Tx coalescing above Rx coalescing timeout.

Quick tracing of the events seems to indicate that the data has already
been acked when we enter tcp:tcp_retransmit_skb:
Seems like I'm pretty lost here and the tcp:tcp_retransmit_skb events
are less spurious than I thought. Looking at some tcpdump traces we see:

0.045277 IP6 A > B: Flags [SEW], seq 2248382925:2248383296, win 61920, options [mss 1440,sackOK,TS val 658870494 ecr 0,nop,wscale 11], length 371

0.045348 IP6 B > A: Flags [S.E], seq 961169456, ack 2248382926, win 65535, options [mss 1440,sackOK,TS val 883864022 ecr 658870494,nop,wscale 9], length 0
The SYNACK does not include the prior payload.
quoted
0.045369 IP6 A > B: Flags [P.], seq 1:372, ack 1, win 31, options [nop,nop,TS val 658870494 ecr 883864022], length 371
So this rtx is not spurious.

However in your prior email you wrote :

bytes_in:      0
bytes_out:   742
bytes_acked: 742

Are you sure that at the time of the retransmit, bytes_acked was 742 ?
I do not see how this could happen.
Yes, this packetdrill test confirms TCP INFO stuff seems correct .

//
// SYN-data are not fully acknowledged
//
`../../common/defaults.sh
 sysctl -wq net.ipv4.tcp_timestamps=0 `

// Cache warmup: send a Fast Open cookie request
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
   +0 sendto(3, ..., 0, MSG_FASTOPEN, ..., ...) = -1 EINPROGRESS
(Operation is now in progress)
   +0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop>
 +.01 < S. 123:123(0) ack 1 win 5840 <mss
1460,nop,nop,sackOK,nop,wscale 6,FO 1234abcd,nop,nop>
   +0 %{ assert tcpi_delivered == 1, tcpi_delivered }%
   +0 > . 1:1(0) ack 1
 +.01 close(3) = 0
   +0 > F. 1:1(0) ack 1
 +.01 < F. 1:1(0) ack 2 win 92
   +0 > .  2:2(0) ack 2


//
// Test: the data in SYN-data are retransmitted when server acks only ISN
//
   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
   +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
   +0 sendto(4, ..., 700, MSG_FASTOPEN, ..., ...) = 700
   +0 > S 0:700(700) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO 1234abcd,nop,nop>
 +.02 < S. 0:0(0) ack 1 win 5840 <mss 1040,nop,nop,sackOK,nop,wscale
6,FO 1234abcd,nop,nop>
// Retransmits data in SYN-data on the first ACK.
   +0 %{ assert tcpi_delivered == 1, tcpi_delivered }%
   +0 %{ assert tcpi_bytes_acked == 1, tcpi_bytes_acked }%
   +0 %{ assert tcpi_bytes_sent == 2*700, tcpi_bytes_sent }%
   +0 > P. 1:701(700) ack 1
   +0 < P. 1:1001(1000) ack 701 win 257
   +0 %{ assert tcpi_delivered == 2, tcpi_delivered }%
   +0 %{ assert tcpi_bytes_acked == 701, tcpi_bytes_acked }%
   +0 > . 701:701(0) ack 1001
   +0 read(4, ..., 1024) = 1000
   +0 %{ assert (tcpi_options & TCPI_OPT_SYN_DATA) == 0, tcpi_options }%
   +0 close(4) = 0
   +0 > F. 701:701(0) ack 1001
 +.02 < F. 1001:1001(0) ack 702 win 257
   +0 > . 702:702(0) ack 1002
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help