Thread (6 messages), 3 authors, 2010-02-09

Re: [PATCH 0/3][v2] tcp: fix ICMP-RTO war

From: Damian Lukowski <hidden>
Date: 2010-02-09 22:29:13

On 09.02.2010 at 13:37, Ilpo Järvinen [off-list ref] wrote:
On Mon, 8 Feb 2010, Damian Lukowski wrote:
Damian Lukowski wrote:
On 01.02.2010 at 08:33, David Miller [off-list ref] wrote:
From: Damian Lukowski <redacted>
Date: Fri, 29 Jan 2010 23:15:51 +0100
These patches fix the current RTO calculation routine when
srtt and rttvar are zero, yielding an RTO of zero.
Under some circumstances, TCP's srtt and rttvar are zero,
yielding a calculated RTO of zero.
This is particularly unfortunate for ICMP based RTO recalculation
as introduced in f1ecd5d9e736660 (Revert Backoff [v3]: Revert RTO
on ICMP destination unreachable), as it results in RTO retransmission
flooding.

Thanks to Ilpo Jarvinen for providing debug patches and to
Denys Fedoryshchenko for reporting and testing.

Signed-off-by: Damian Lukowski <redacted>
I still haven't seen a detailed enough analysis of why these
tiny RTOs can come to exist in the first place.

Please show me a list of events, function by function, the value of
relevant variables and per-socket TCP state, in the TCP stack, that
show how this ends up happening.

Thanks for all of your work on this so far.
I might have figured it out, but could not verify it, so maybe you can
comment on my thoughts.

When a listening TCP receives a SYN, it will send a SYN+ACK
and wait for an ACK to complete the handshake.
Look at tcp_rcv_state_process::step 5::case SYN_RECV::acceptable
and the code after the comment "tcp_ack considers this ACK as duplicate
and does not calculate rtt".

If the connecting client has disabled timestamps, the rtt statistics
won't be updated here, while the state is changed above.
I printk'ed at the very end of TCP_SYN_RECV and got the following:
state 1 (ESTABLISHED), srtt 0, rttvar 0.

So my suspicion is: If connectivity breaks right after a listening TCP
has completed the handshake without timestamps, and the listening TCP
sends data after establishing the connection, we will get the observed
behaviour.
Just verified that by dropping pure ACKs coming from the originally
listening TCP using iptables.
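
To illustrate the arithmetic behind the scenario above, here is a minimal userspace sketch. It assumes the RFC 2988 formula RTO = SRTT + 4 * RTTVAR and an exponential backoff applied as a left shift. The struct and function names are hypothetical stand-ins, not the kernel's own definitions, and no clamping to a minimum RTO is modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for per-socket RTT state; not the kernel's struct. */
struct rtt_state {
    unsigned int srtt_ms;   /* smoothed RTT in ms, 0 if never sampled */
    unsigned int rttvar_ms; /* RTT variance in ms, 0 if never sampled */
};

/* RFC 2988 formula: RTO = SRTT + 4 * RTTVAR (no clamping in this sketch). */
static unsigned int rto_from_estimates(const struct rtt_state *s)
{
    return s->srtt_ms + 4 * s->rttvar_ms;
}

/* Base RTO doubled `backoff` times, as an exponential backoff would do. */
static unsigned int rto_with_backoff(const struct rtt_state *s,
                                     unsigned int backoff)
{
    assert(backoff < 16); /* keep the shift well inside the integer width */
    return rto_from_estimates(s) << backoff;
}
```

With srtt and rttvar both zero, rto_with_backoff() returns zero for any backoff value, because shifting zero left still gives zero. With a sampled state such as { 100, 25 }, a backoff of 3 gives 1600 ms.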
Isn't that "tcp_ack considers" comment like this: we have a bug elsewhere
in the code but work around it here, at least for some of the cases? (I'd
put it the other way around: but alas, only for part of the cases.) ...It
sounds like that to me. What exactly is the reason why rtt shouldn't be
calculated/initialized on such an ACK, anything I'm missing?

The RTT is extracted when traversing tcp_write_queue_head() in
tcp_clean_rtx_queue() but is null when the SYNACK is acknowledged,
so it may have seemed to be a harder issue to the author of the comment.
I have also been thinking for a while about how to obtain the timestamp
of the sent SYNACK, but there is no need for it.
We just need to call tcp_ack_no_tstamp(), which will set the
RTO to 3 seconds as defined in RFC 2988.
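
The idea of falling back to the RFC 2988 initial RTO when no RTT sample exists can be sketched as follows. This is a userspace illustration only: the has_sample flag, the struct, and the function names are assumptions made for the example, and it does not reproduce what tcp_ack_no_tstamp() does internally.

```c
#include <assert.h>
#include <stdbool.h>

/* RFC 2988, section 2.1: use 3 seconds until an RTT measurement exists. */
#define INITIAL_RTO_MS 3000u

/* Hypothetical estimator state; not the kernel's struct tcp_sock. */
struct rtt_estimator {
    bool has_sample;        /* false until the first RTT measurement */
    unsigned int srtt_ms;   /* smoothed RTT in ms */
    unsigned int rttvar_ms; /* RTT variance in ms */
};

/* Return the initial RTO when unsampled, else SRTT + 4 * RTTVAR. */
static unsigned int current_rto_ms(const struct rtt_estimator *e)
{
    if (!e->has_sample)
        return INITIAL_RTO_MS;
    return e->srtt_ms + 4 * e->rttvar_ms;
}

/* Apply exponential backoff on top of the guarded base RTO. */
static unsigned int backed_off_rto_ms(const struct rtt_estimator *e,
                                      unsigned int backoff)
{
    assert(backoff < 16); /* keep the shift well inside the integer width */
    return current_rto_ms(e) << backoff;
}
```

With this guard, an unsampled connection at backoff 2 yields 12000 ms instead of zero, so a recalculation from the base RTO can no longer collapse the timer.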

Regards
   Damian