Thread: 8 messages, 3 authors, 2009-09-26

Re: TCP stack bug related to F-RTO?

From: Joe Cao <hidden>
Date: 2009-09-26 20:48:28
Also in: lkml

Hi Ilpo,

Thanks for the reply.  We noticed the problem while debugging a connection failure reported by one of our customers (we are a network device vendor).  We have already suggested that the customer upgrade their server software to fix the problem, and we are still waiting for their feedback.  Meanwhile, I asked all those questions because I want to understand the issue and the fixes.  We also have to convince the customer to move to the right kernel, and we don't want them to come back with the same problem again.

Again, thanks for the help!

Joe
--- On Sat, 9/26/09, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
From: Ilpo Järvinen <redacted>
Subject: Re: TCP stack bug related to F-RTO?
To: "Joe Cao" <redacted>
Cc: "Ray Lee" <redacted>, "Netdev" <redacted>, "LKML" <redacted>
Date: Saturday, September 26, 2009, 10:51 AM
On Sat, 26 Sep 2009, Joe Cao wrote:

> Can you elaborate on "Some retransmission would happen here as step 3"?
> When the second timeout happens, it will again go into FRTO and then
> retransmit the write queue head.

Why do you think that the second RTO will happen with anything else than
with 2.6.24. And it's perfectly ok to go into FRTO for the second time.

> I looked at the patch (debian Bug#478062) that's probably what you
> mentioned as the fix. All it does was to exclude the SACK case when
> considering FRTO.  But in my case, SACK was enabled, as seen in the
> trace..

You should be looking from where I said rather than picking up your own
sources and assuming that they'll tell you all the story :-). In fact,
there are two fixes that were made in a row and one workaround in the
same timeframe. ...And you managed to pick the wrong one of the fixes, so
I kind of understand why you got confused :-).

> In other words, do we still have a problem with FRTO when SACK is
> enabled in the latest kernel?

For sure we might have all kinds of problems no one has yet
noticed/reported :-). ....However, it seems that this particular problem
your trace is showing is solved. Can you please test with a fixed kernel
before coming back here with these claims.


-- 
 i.
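
For readers following the exchange above, the basic F-RTO decision can be sketched roughly as below. This is a simplified model based on the basic algorithm in RFC 5682, not the kernel code: it ignores the "recover" sequence check and the SACK-enhanced variant, and the names `Outcome` and `frto_outcome` are made up for illustration.

```python
from enum import Enum


class Outcome(Enum):
    SPURIOUS = "timeout was spurious; continue sending new data"
    LOSS = "real loss; fall back to conventional RTO recovery"


def frto_outcome(first_ack_advances: bool, second_ack_advances: bool) -> Outcome:
    """Simplified F-RTO verdict after an RTO retransmission of the head segment.

    first_ack_advances:  the first ACK after the retransmission acknowledges new data
    second_ack_advances: the ACK that follows the (up to two) new segments also does
    """
    # A duplicate ACK right after the retransmission means real loss.
    if not first_ack_advances:
        return Outcome.LOSS
    # The sender now transmits up to two previously unsent segments
    # and looks at the next ACK.
    if second_ack_advances:
        return Outcome.SPURIOUS
    # Duplicate ACK for the new segments: the original flight was lost.
    return Outcome.LOSS


# The pattern reported in the trace: the retransmission is ACKed,
# two new packets go out, and duplicate ACKs come back.
print(frto_outcome(first_ack_advances=True, second_ack_advances=False).value)
```

In this model the pattern from the trace ends in the loss branch, which is the point where conventional recovery is expected to take over. Entering F-RTO again on a later timeout simply repeats the same check.
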
--- On Fri, 9/25/09, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
From: Ilpo Järvinen <redacted>
Subject: Re: TCP stack bug related to F-RTO?
To: "Joe Cao" <redacted>
Cc: "Ray Lee" <redacted>, "Netdev" [off-list ref], "LKML" [off-list ref]
Date: Friday, September 25, 2009, 11:03 AM
On Fri, 25 Sep 2009, Joe Cao wrote:

> Thanks for the reply!  Do you happen to know which patch fixed the
> problem?

You can find those patches from the stable queue git tree. I gave you hint
from what release to look from in the last mail. However, as 2.6.24 is
anyway obsolete my recommendation is that you should probably consider
upgrading to fix all the other bugs that have been found since 2.6.24 was
obsoleted.

> Is there a bug tracking system for linux kernel?

Nothing that knows everything about everything.

> I studied the FRTO code in latest kernel 2.6.31.. It seems the problem
> is still there:
>
> 1. Every time a RTO fires, because tcp_is_sackfrto(tp) returns 1,
> tcp_use_frto() returns true.  And the server tcp enters FRTO.
>
> 2. After the head of write queue is retransmitted, two new data packets
> are transmitted, the server receives two dup-ACKs.  That will make the
> TCP enter tcp_enter_frto_loss(), however, that only resets ssthresh and
> some other fields.

Perhaps those other fields are far more important than you think... :-)
...Some retransmission would happen here as step 3.

> 3. After another longer RTO fires, because tcp_is_sackfrto(tp) returns
> 1, tcp_use_frto() again returns true.  The stack enters FRTO again.
>
> 4. The above repeats and the stack couldn't retransmit the lost packets
> faster.
>
> Is my understanding above correct?

...No. All magic that happens in tcp_enter_frto_loss should be enough to
really do more than a single retransmission (that is, in any other than
2.6.24 series kernel). There was an unfortunate bug in this area in 2.6.24
which basically undid the effect of correct actions tcp_enter_frto_loss
did which effectively prevented tcp_xmit_retransmit_queue from doing its
part.

-- 
 i.
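
To picture why the loss bookkeeping matters, here is a toy model of the relationship described above: a retransmit-queue walk can only resend segments that have been marked lost. It is a sketch under loud assumptions, not kernel code. `Segment`, `enter_loss`, `retransmit_queue` and the `cwnd` values are hypothetical stand-ins, and the thread does not say how the 2.6.24 bug undid the marks.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int
    lost: bool = False   # toy analogue of a "marked lost" flag
    acked: bool = False


def enter_loss(queue):
    """Mark every unacknowledged segment as lost (stand-in for the
    bookkeeping attributed to tcp_enter_frto_loss in the thread)."""
    for seg in queue:
        if not seg.acked:
            seg.lost = True


def retransmit_queue(queue, cwnd):
    """Resend up to cwnd segments that are marked lost; return their seqs."""
    sent = []
    for seg in queue:
        if len(sent) >= cwnd:
            break
        if seg.lost and not seg.acked:
            seg.lost = False  # now in flight again as a retransmission
            sent.append(seg.seq)
    return sent


# Packets #15-#32 from the trace are still outstanding after #14 is ACKed.
queue = [Segment(seq) for seq in range(15, 33)]

print(retransmit_queue(queue, cwnd=2))  # nothing is marked lost yet
enter_loss(queue)
print(retransmit_queue(queue, cwnd=2))  # now the walk has work to do
```

If the marks are missing for any reason, the walk finds nothing to send and only the head retransmission at each timeout makes progress, which matches the one-packet-per-RTO pattern in the report.
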
--- On Fri, 9/25/09, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
From: Ilpo Järvinen <redacted>
Subject: Re: TCP stack bug related to F-RTO?
To: "Ray Lee" <redacted>
Cc: "Joe Cao" <redacted>, "Netdev" [off-list ref], "LKML" [off-list ref], jcaoco2002@yahoo.com
Date: Friday, September 25, 2009, 6:09 AM
On Thu, 24 Sep 2009, Ray Lee wrote:

> [adding netdev cc:]
>
> On Thu, Sep 24, 2009 at 10:43 AM, Joe Cao [off-list ref] wrote:
> > Hello,
> >
> > I have found the following behavior with different versions of linux
> > kernel. The attached pcap trace is collected with server
> > (192.168.0.13) running 2.6.24 and shows the problem. Basically the
> > behavior is like this:
> >
> > 1. The client opens up a big window,
> > 2. the server sends 19 packets in a row (pkt #14-#32 in the trace),
> > but all of them are dropped due to some congestion.
> > 3. The server hits RTO and retransmits pkt #14 in #33
> > 4. The client immediately acks #33 (=#14), and the server (which
> > seems to enter F-RTO) expands the window and sends *NEW* pkt #35 &
> > #36. Timeout is doubled to 2*RTO; The client immediately sends two
> > Dup-acks to #35 and #36.
> > 5. after 2*RTO, pkt #15 is retransmitted in #39.
> > 6. The client immediately acks #39 (=#15) in #40, and the server
> > continues to expand the window and sends two *NEW* pkt #41 & #42. Now
> > the timeout is doubled to 4*RTO.
> > 8. After 4*RTO timeout, #16 is retransmitted.
> > 9....
> > 10. The above steps repeat for retransmitting pkt #16-#32 and each
> > time the timeout is doubled.
> > 11. It takes a long long time to retransmit all the lost packets and
> > before that is done, the client sends a RST because of timeout.
> >
> > The above behavior looks like F-RTO is in effect.  And there seems to
> > be a bug in the TCP's congestion control and retransmission algorithm.
> > Why doesn't the TCP on server (running 2.6.24) enter the slow start?
> > Why should the server take that long to recover from a short period
> > of packet loss?
> >
> > Has anyone else noticed similar problem before?  If my analysis was
> > wrong, can anyone give me some pointers to what's really wrong and
> > how to fix it?

Yes, 2.6.24 is an obsoleted version with known wrongs in FRTO
implementation. Fixes never went to 2.6.24 stable series as it was
_already_ obsoleted when the problems were reported and found. The
correct fixes may be found from 2.6.25.7 (.7 iirc) and are included from
2.6.26 onward too.

Just in case you happen to run ubuntu based kernel from that era (of
course you should be reporting the bug here then...), a word of warning:
it seemed nearly impossible for them to get a simple thing like that
fixed, I haven't been looking if they'd eventually come to some sensible
conclusion in that matter or is it still unresolved (or e.g., closed
without real resolution).
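
The recovery pattern in the original report (one retransmission per timeout, with the timer doubling each time) adds up quickly. The sketch below models only that reported pattern, not the kernel. The initial RTO of 0.2 s is an assumed value since the trace timings are not given in the thread, the 120 s cap is likewise an assumption, and `recovery_timeline` is a hypothetical helper.

```python
def recovery_timeline(lost_segments, initial_rto, max_rto=120.0):
    """Return the elapsed time (seconds) at which each lost segment is
    retransmitted, assuming exactly one retransmission per timeout and a
    timer that doubles after every timeout, capped at max_rto."""
    elapsed = 0.0
    rto = initial_rto
    times = []
    for _ in range(lost_segments):
        elapsed += rto            # wait for the timer to fire
        times.append(elapsed)     # one segment is retransmitted
        rto = min(rto * 2, max_rto)  # exponential backoff
    return times


# 19 lost packets as in the report; 0.2 s initial RTO is an assumption.
timeline = recovery_timeline(lost_segments=19, initial_rto=0.2)
print(f"last segment retransmitted after {timeline[-1]:.1f} s")
```

With these assumed values the last of the 19 segments would go out roughly 21 minutes after the first timeout, so it is unsurprising that the client gave up and sent a RST first.
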

      