Thread (6 messages), 3 authors, 2010-02-09

Re: [PATCH 0/3][v2] tcp: fix ICMP-RTO war

From: Ilpo Järvinen <hidden>
Date: 2010-02-09 12:37:39

On Mon, 8 Feb 2010, Damian Lukowski wrote:
Damian Lukowski wrote:
On 01.02.2010, at 08:33, David Miller [off-list ref] wrote:
From: Damian Lukowski <redacted>
Date: Fri, 29 Jan 2010 23:15:51 +0100
These patches fix the current RTO calculation routine for the case
where srtt and rttvar are zero, yielding an RTO of zero.
Under some circumstances, TCP's srtt and rttvar are zero,
yielding a calculated RTO of zero.
This is particularly unfortunate for ICMP-based RTO recalculation
as introduced in f1ecd5d9e736660 (Revert Backoff [v3]: Revert RTO
on ICMP destination unreachable), as it results in RTO retransmission
flooding.

Thanks to Ilpo Jarvinen for providing debug patches and to
Denys Fedoryshchenko for reporting and testing.

Signed-off-by: Damian Lukowski <redacted>
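
To make the failure mode in the patch description concrete, here is a minimal user-space sketch, not kernel code. It assumes the base RTO has the shape (srtt >> 3) + rttvar and that the ICMP revert path recomputes the RTO as that base shifted left by the remaining backoff count. The struct, the function names and the SKETCH_RTO_MAX bound are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-socket RTT state; not the kernel struct. */
struct rtt_state {
    uint32_t srtt;    /* smoothed RTT, assumed stored scaled by 8 */
    uint32_t rttvar;  /* RTT variance term, assumed stored scaled by 4 */
    uint32_t backoff; /* exponential backoff counter, assumed < 32 */
};

#define SKETCH_RTO_MAX 120000u /* assumed upper bound, in ms */

/* Base RTO before backoff: zero whenever srtt and rttvar are both zero. */
static uint32_t sketch_base_rto(const struct rtt_state *s)
{
    return (s->srtt >> 3) + s->rttvar;
}

/*
 * RTO as recomputed after a backoff revert. Only an upper bound is
 * applied, so a zero base stays zero for every backoff value.
 */
static uint32_t sketch_backed_off_rto(const struct rtt_state *s)
{
    uint64_t rto = (uint64_t)sketch_base_rto(s) << s->backoff;

    return rto > SKETCH_RTO_MAX ? SKETCH_RTO_MAX : (uint32_t)rto;
}
```

With srtt = rttvar = 0, the shifted value is zero no matter how large the backoff is, so a timer armed with it would fire immediately, which is consistent with the retransmission flooding described above. With nonzero state, the same function yields a sensible, bounded timeout.
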
I still haven't seen a detailed enough analysis of why these
tiny RTOs can come to exist in the first place.

Please show me a list of events, function by function, the value of
relevant variables and per-socket TCP state, in the TCP stack, that
show how this ends up happening.

Thanks for all of your work on this so far.
I might have figured it out, but could not verify it, so maybe you can
comment on my reasoning.

When a listening TCP receives a SYN, it will send a SYN+ACK
and wait for an ACK to complete the handshake.
Look at tcp_rcv_state_process::step 5::case SYN_RECV::acceptable
and the code after the comment "tcp_ack considers this ACK as duplicate
and does not calculate rtt".

If the connecting client has disabled timestamps, the rtt statistics
won't be updated here, while the state is changed above.
I printk'ed at the very end of TCP_SYN_RECV and got the following:
state 1 (ESTABLISHED), srtt 0, rttvar 0.

So my suspicion is: If connectivity breaks right after a listening TCP
has completed the handshake without timestamps, and the listening TCP
sends data after establishing the connection, we will get the observed
behaviour.
Just verified that by dropping pure ACKs coming from the originally
listening TCP using iptables.
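
The sequence described above can be modelled with a small sketch. It is a simplified illustration under stated assumptions, not the code in tcp_rcv_state_process: the state names, the struct and sketch_rcv_handshake_ack() are hypothetical, and the seeding rule is the usual first-sample initialization (SRTT = R, RTTVAR = R/2) expressed in an assumed scaled form.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_state { SK_LISTEN, SK_SYN_RECV, SK_ESTABLISHED };

/* Hypothetical per-socket model; field scaling is an assumption. */
struct sketch_sock {
    enum sketch_state state;
    uint32_t srtt;   /* smoothed RTT, scaled by 8 */
    uint32_t rttvar; /* RTT variance term, scaled by 4 */
};

/*
 * Model of the final handshake ACK arriving while in SYN_RECV.
 * The state always advances to ESTABLISHED, but the RTT estimate is
 * only seeded when the peer echoed a timestamp.
 */
static void sketch_rcv_handshake_ack(struct sketch_sock *sk,
                                     bool saw_tstamp,
                                     uint32_t rtt_sample_ms)
{
    if (sk->state != SK_SYN_RECV)
        return;

    sk->state = SK_ESTABLISHED;

    if (saw_tstamp) {
        sk->srtt = rtt_sample_ms << 3;   /* SRTT = R, scaled by 8 */
        sk->rttvar = rtt_sample_ms << 1; /* RTTVAR = R/2, scaled by 4 */
    }
    /* Without timestamps, srtt and rttvar keep their initial zeros. */
}
```

In this model, a client without timestamps leaves the socket in ESTABLISHED with srtt 0 and rttvar 0, matching the printk output quoted above. If connectivity then breaks before any data ACK supplies an RTT sample, any later recalculation from these fields starts from zero.
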
Isn't that "tcp_ack considers" comment saying something like this: we have
a bug elsewhere in the code but work around it here, at least for some of
the cases? (I'd put it the other way around: but alas, only for part of the
cases.) ...It sounds like that to me. What exactly is the reason why rtt
shouldn't be calculated/initialized on such an ACK? Is there anything I'm
missing?

-- 
 i.