Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

From: Yuchung Cheng <hidden>
Date: 2017-11-06 22:28:13
Subsystem: networking [general], networking [tcp], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Neal Cardwell, Linus Torvalds

On Fri, Oct 27, 2017 at 1:38 PM, Eric Dumazet [off-list ref] wrote:

On Wed, Oct 25, 2017 at 10:37 PM, Yuchung Cheng [off-list ref] wrote:

On Wed, Oct 25, 2017 at 7:07 PM, Alexei Starovoitov
[off-list ref] wrote:

On Thu, Sep 28, 2017 at 04:36:58PM -0700, Yuchung Cheng wrote:

On Thu, Sep 28, 2017 at 1:14 AM, Oleksandr Natalenko
[off-list ref] wrote:

Hi.

I can't say anything about the panic in tcp_sacktag_walk() since I cannot
trigger it intentionally, but setting net.ipv4.tcp_retrans_collapse to 0
*does not* fix the warning in tcp_fastretrans_alert() for me.

Hi Oleksandr: no, retrans_collapse should not matter for that warning
in tcp_fastretrans_alert(). The warning, as I explained earlier, is

Hi guys, can you check whether the warning goes away with this quick fix?

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0ada8bfc2ebd..072aab2a8226 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c

@@ -2626,7 +2626,7 @@ void tcp_simple_retransmit(struct sock *sk)

        tcp_clear_retrans_hints_partial(tp);

-       if (prior_lost == tp->lost_out)
+       if (!tp->lost_out)
                return;

        if (tcp_is_reno(tp))

likely false. Neal and I are more concerned about the panic in
tcp_sacktag_walk(). This is just a blind shot, but thanks for retrying.

We can submit a one-liner to remove the fast retrans warning but want
to nail the bigger issue first.

We're still seeing the warnings followed by crashes, and it's very concerning.
We hoped that Neal's most recent patches from Sep 18 in this area might
magically fix the issue, but no. The panics are still there.
It's confirmed that net.ipv4.tcp_retrans_collapse=0 does not help,
whereas net.ipv4.tcp_recovery=0 works but is obviously undesirable.
We're out of ideas on how to debug this.

Can you try Eric's latest SACK rb-tree patches?
https://patchwork.ozlabs.org/cover/822218/

Roman's SNMP data suggests MTU probing is enabled. Another blind shot
is to disable it.


Or alternatively, try this fix:

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 1151870018e345592853b035a0902121c41e268d..6a849c7028f06f31b36a906be37995b28b579a40

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c

@@ -2062,6 +2062,8 @@ static int tcp_mtu_probe(struct sock *sk)
        nskb->ip_summed = skb->ip_summed;

        tcp_insert_write_queue_before(nskb, skb, sk);
+       if (skb == tp->highest_sack)
+               tp->highest_sack = nskb;

        len = 0;
        tcp_for_write_queue_from_safe(skb, next, sk) {
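
[Editor's sketch, not part of the original message.] The two added lines in
the hunk above can be illustrated with a toy model. It assumes that
highest_sack is meant to reference the skb just after the highest SACKed one,
so a new skb inserted in front of that position has to take over the pointer.
All names below (toy_skb, toy_sock, toy_insert_probe, toy_highest_sack_ok) are
hypothetical stand-ins, not kernel structures or APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb on the write queue. */
struct toy_skb {
	struct toy_skb *prev, *next;
	int sacked;			/* non-zero if this skb was SACKed */
};

/* Hypothetical stand-in for the per-socket state used here. */
struct toy_sock {
	struct toy_skb *head;		/* first skb in the queue */
	struct toy_skb *highest_sack;	/* assumed: skb just after the highest SACKed skb */
};

/* Link nskb immediately before skb, without touching highest_sack. */
static void toy_insert_before(struct toy_sock *sk, struct toy_skb *nskb,
			      struct toy_skb *skb)
{
	assert(sk && nskb && skb);
	nskb->prev = skb->prev;
	nskb->next = skb;
	if (skb->prev)
		skb->prev->next = nskb;
	else
		sk->head = nskb;
	skb->prev = nskb;
}

/* Same insertion, plus the pointer fix-up mirrored from the hunk above. */
static void toy_insert_probe(struct toy_sock *sk, struct toy_skb *nskb,
			     struct toy_skb *skb)
{
	toy_insert_before(sk, nskb, skb);
	if (skb == sk->highest_sack)
		sk->highest_sack = nskb;
}

/* Return 1 if highest_sack directly follows the last SACKed skb in the
 * queue, or if nothing is SACKed (the pointer is then not meaningful). */
static int toy_highest_sack_ok(const struct toy_sock *sk)
{
	const struct toy_skb *p, *last_sacked = NULL;

	for (p = sk->head; p; p = p->next)
		if (p->sacked)
			last_sacked = p;
	if (!last_sacked)
		return 1;
	return sk->highest_sack == last_sacked->next;
}
```

In this model, calling toy_insert_before() alone on the skb that highest_sack
references leaves the pointer one element too far along the queue, while
toy_insert_probe() keeps the assumed invariant intact. Whether this is the
actual cause of the tcp_sacktag_walk() panic is exactly what the thread is
still trying to establish.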
