Re: TCP NewReno and single retransmit

From: Marcelo Ricardo Leitner <hidden>
Date: 2014-11-03 16:38:09

On 31-10-2014 01:51, Yuchung Cheng wrote:

On Thu, Oct 30, 2014 at 7:24 PM, Marcelo Ricardo Leitner
[off-list ref] wrote:

quoted

On 30-10-2014 00:03, Neal Cardwell wrote:

quoted

On Mon, Oct 27, 2014 at 2:49 PM, Marcelo Ricardo Leitner
[off-list ref] wrote:

quoted

Hi,

We have a report from a customer saying that on a very calm connection,
like
having only a single data packet within some minutes, if this packet gets
to
be re-transmitted, retrans_stamp is only cleared when the next acked
packet
is received. But this may make we abort the connection too soon if this
next
packet also gets lost, because the reference for the initial loss is
still
for a big while ago..

...

quoted

@@ -2382,31 +2382,32 @@ static inline bool tcp_may_undo(const struct

tcp_sock *tp)
   static bool tcp_try_undo_recovery(struct sock *sk)

...

quoted

          if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
                  /* Hold old state until something *above* high_seq
                   * is ACKed. For Reno it is MUST to prevent false
                   * fast retransmits (RFC2582). SACK TCP is safe. */

Or we can just remove this strange state-holding logic?

I couldn't find such a "MUST" statement in RFC2582. RFC2582 section 3
step 5 suggests exiting the recovery procedure when an ACK acknowledges
the "recover" variable (== tp->high_seq - 1).

Since we've called tcp_reset_reno_sack() before tcp_try_undo_recovery(),
I couldn't see how false fast retransmits can be triggered without
this state-holding.

Any insights?

Nice one, me neither. Neal?

 From RFC2582, Section 5, Avoiding Multiple Fast Retransmits:

    Nevertheless, unnecessary Fast Retransmits can occur with Reno or
    NewReno TCP, particularly if a Retransmit Timeout occurs during Fast
    Recovery.  (This is illustrated for Reno on page 6 of [F98], and for
    NewReno on page 8 of [F98].)  With NewReno, the data sender remains
    in Fast Recovery until either a Retransmit Timeout, or *until all of
    the data outstanding when Fast Retransmit was entered has been
    acknowledged*.  Thus with NewReno, the problem of multiple Fast
    Retransmits from a single window of data can only occur after a
    Retransmit Timeout.

Bolding mark is mine. If I didn't miss anything, as that condition was met, we 
should be good to keep that cwnd reduction (required by section 3 step 5) and 
but get back to Open state right away.

Marcelo

quoted

                  tcp_moderate_cwnd(tp);
+               tp->retrans_stamp = 0;
                  return true;
          }
          tcp_set_ca_state(sk, TCP_CA_Open);
          return false;
   }

We would still hold state, at least part of it.. WDYT?


This approach sounds OK to me as long as we include a check of
tcp_any_retrans_done(), as we do in the similar code paths (for
motivation, see the comment above tcp_any_retrans_done()).


Yes, okay. I thought that this would be taken care of already by then but
reading the code again now after your comment, I can see what you're saying.
Thanks.

quoted

So it sounds fine to me if you change that one new line to the following
2:

+  if (!tcp_any_retrans_done(sk))
+    tp->retrans_stamp = 0;


Will do.

quoted

Nice catch!


A good part of it (including the diagram) was done by customer. :)
I'll post the patch as soon as we sync with them (credits).

Marcelo

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help