Thread: 20 messages, 8 authors, 2017-11-10

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

From: Yuchung Cheng <hidden>
Date: 2017-11-06 22:28:13
Subsystem: networking [general], networking [tcp], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Neal Cardwell, Linus Torvalds


On Fri, Oct 27, 2017 at 1:38 PM, Eric Dumazet [off-list ref] wrote:
On Wed, Oct 25, 2017 at 10:37 PM, Yuchung Cheng [off-list ref] wrote:
On Wed, Oct 25, 2017 at 7:07 PM, Alexei Starovoitov
[off-list ref] wrote:
On Thu, Sep 28, 2017 at 04:36:58PM -0700, Yuchung Cheng wrote:
On Thu, Sep 28, 2017 at 1:14 AM, Oleksandr Natalenko
[off-list ref] wrote:
Hi.

I can't say anything about the panic in tcp_sacktag_walk() since I cannot
trigger it intentionally, but setting net.ipv4.tcp_retrans_collapse to 0
*does not* fix the warning in tcp_fastretrans_alert() for me.
Hi Oleksandr: no, retrans_collapse should not matter for that warning
in tcp_fastretrans_alert(). The warning, as I explained earlier, is
Hi guys, can you check whether the warning goes away with this quick fix?

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0ada8bfc2ebd..072aab2a8226 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2626,7 +2626,7 @@ void tcp_simple_retransmit(struct sock *sk)

        tcp_clear_retrans_hints_partial(tp);

-       if (prior_lost == tp->lost_out)
+       if (!tp->lost_out)
                return;

        if (tcp_is_reno(tp))
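
The effect of the changed condition can be modelled outside the kernel.
The sketch below is a toy, not kernel code: struct lost_state and both
helper names are hypothetical, and it assumes prior_lost holds the value
of tp->lost_out sampled before tcp_simple_retransmit() marks segments.
It only shows in which cases each form of the test returns early; it
says nothing about whether the change silences the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters the hunk compares. */
struct lost_state {
    unsigned int prior_lost; /* assumed: tp->lost_out before marking */
    unsigned int lost_out;   /* tp->lost_out after marking */
};

/* Old test: return early when this call marked nothing new as lost. */
static bool old_check_returns_early(const struct lost_state *s)
{
    assert(s != NULL);
    return s->prior_lost == s->lost_out;
}

/* Proposed test: return early only when nothing at all is marked lost. */
static bool new_check_returns_early(const struct lost_state *s)
{
    assert(s != NULL);
    return s->lost_out == 0;
}
```

The two tests differ only when segments were already marked lost before
the call and the call adds none (for example prior_lost = 2 and
lost_out = 2): the old test returns early, the proposed one falls
through to the code after the check.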


likely false. Neal and I are more concerned about the panic in
tcp_sacktag_walk(). This is just a blind shot, but thanks for retrying.

We can submit a one-liner to remove the fast retrans warning but want
to nail the bigger issue first.
We're still seeing the warnings followed by crashes, and it's very concerning.
We hoped that Neal's most recent patches from Sep 18 in this area might
magically fix the issue, but no. The panics are still there.
It's confirmed that net.ipv4.tcp_retrans_collapse=0 does not help,
whereas net.ipv4.tcp_recovery=0 works but is obviously undesirable.
We're out of ideas on how to debug this.
Can you try Eric's latest SACK rb-tree patches?
https://patchwork.ozlabs.org/cover/822218/

Roman's SNMP data suggests MTU probing is enabled. Another blind shot
is to disable it.
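
For anyone checking the knobs discussed here, a small sketch of reading
them from a C program. It assumes the usual Linux convention that a
dotted sysctl name maps to a file under /proc/sys, and that MTU probing
is controlled by net.ipv4.tcp_mtu_probing (that name does not appear in
this thread). The helper names sysctl_path() and sysctl_read_int() are
made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Turn "net.ipv4.tcp_mtu_probing" into "/proc/sys/net/ipv4/tcp_mtu_probing".
 * Returns 0 on success, -1 if the output buffer is too small.
 * Simplification: every '.' becomes '/', which is wrong for names that
 * embed an interface name containing dots.
 */
static int sysctl_path(const char *name, char *out, size_t outlen)
{
    static const char prefix[] = "/proc/sys/";
    size_t plen = sizeof(prefix) - 1;
    size_t nlen;

    assert(name != NULL && out != NULL);
    nlen = strlen(name);
    if (plen + nlen + 1 > outlen)
        return -1;
    memcpy(out, prefix, plen);
    for (size_t i = 0; i < nlen; i++)
        out[plen + i] = (name[i] == '.') ? '/' : name[i];
    out[plen + nlen] = '\0';
    return 0;
}

/* Read an integer sysctl. Returns 0 on success, -1 if unavailable. */
static int sysctl_read_int(const char *name, int *value)
{
    char path[256];
    FILE *f;
    int ok;

    if (sysctl_path(name, path, sizeof(path)) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling sysctl_read_int("net.ipv4.tcp_mtu_probing", &v) before and after
a change confirms which setting a test run actually used. Changing the
value requires root and is normally done with sysctl -w.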

Or alternatively try this fix:
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 1151870018e345592853b035a0902121c41e268d..6a849c7028f06f31b36a906be37995b28b579a40 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2062,6 +2062,8 @@ static int tcp_mtu_probe(struct sock *sk)
        nskb->ip_summed = skb->ip_summed;

        tcp_insert_write_queue_before(nskb, skb, sk);
+       if (skb == tp->highest_sack)
+               tp->highest_sack = nskb;

        len = 0;
        tcp_for_write_queue_from_safe(skb, next, sk) {
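
The message does not spell out why the two added lines help. One
plausible reading, stated here as an assumption: the loop that follows
copies data from skb into the new probe segment and may free skb once it
is fully consumed, so a tp->highest_sack that still points at skb would
dangle. The toy model below illustrates that hazard with a plain linked
list. struct seg, struct toy_sock and both functions are hypothetical
and are not the kernel's struct sk_buff or write-queue helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy segment: just a length and a link. */
struct seg {
    int len;
    struct seg *next;
};

struct toy_sock {
    struct seg *head;         /* toy write queue */
    struct seg *highest_sack; /* hint pointing into the queue */
};

/* Append a segment of the given length to the tail of the queue. */
static struct seg *queue_tail_add(struct toy_sock *sk, int len)
{
    struct seg *s = malloc(sizeof(*s));
    struct seg **pp = &sk->head;

    assert(s != NULL);
    s->len = len;
    s->next = NULL;
    while (*pp)
        pp = &(*pp)->next;
    *pp = s;
    return s;
}

/*
 * Build a probe of probe_len bytes in front of the queue head, pulling
 * data from the following segments and freeing those fully consumed.
 */
static struct seg *toy_mtu_probe(struct toy_sock *sk, int probe_len)
{
    struct seg *skb = sk->head;
    struct seg *nskb = malloc(sizeof(*nskb));
    int need = probe_len;

    assert(nskb != NULL);
    nskb->len = 0;
    nskb->next = skb;
    sk->head = nskb;             /* insert nskb before skb */

    if (skb == sk->highest_sack) /* the two added lines, in toy form */
        sk->highest_sack = nskb;

    while (skb && need > 0) {
        struct seg *next = skb->next;
        int copy = skb->len < need ? skb->len : need;

        nskb->len += copy;
        need -= copy;
        skb->len -= copy;
        if (skb->len == 0) {     /* fully consumed: unlink and free */
            nskb->next = next;
            free(skb);
        }
        skb = next;
    }
    return nskb;
}
```

Without the pointer update, highest_sack in this model would keep
referring to the first segment after it has been freed, and any later
walk starting from it would touch freed memory.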