Thread: 7 messages, 3 authors, 2018-10-03

Re: WARN_ON in TLP causing RT throttling

From: Yuchung Cheng <hidden>
Date: 2018-09-28 01:34:42

On Wed, Sep 26, 2018 at 5:09 PM, Eric Dumazet [off-list ref] wrote:


On 09/26/2018 04:46 PM, stranche@codeaurora.org wrote:
Hi Eric,

Someone recently reported a crash to us on the 4.14.62 kernel where excessive
WARNING prints were spamming the logs and causing watchdog bites. The kernel
does have the following commit by Soheil:
bffd168c3fc5 "tcp: clear tp->packets_out when purging write queue"

Before this bug we see over 1 second of continuous WARN_ON prints from
tcp_send_loss_probe() like so:

7795.530450:   <2>  tcp_send_loss_probe+0x194/0x1b8
7795.534833:   <2>  tcp_write_timer_handler+0xf8/0x1c4
7795.539492:   <2>  tcp_write_timer+0x4c/0x74
7795.543348:   <2>  call_timer_fn+0xc0/0x1b4
7795.547113:   <2>  run_timer_softirq+0x248/0x81c

Specifically, the prints come from the following check:

    /* Retransmit last segment. */
    if (WARN_ON(!skb))
        goto rearm_timer;

Since skb is always NULL, we know there's nothing on the write queue or the
retransmit queue, so we just keep resetting the timer, waiting for more data
to be queued. However, we were able to determine that the TCP socket is in the
TCP_FIN_WAIT1 state, so we will no longer be sending any data and these queues
remain empty.

Would it be appropriate to stop resetting the TLP timer if we detect that the
connection is starting to close and we have no more data to send the probe with,
or is there some way that this scenario should already be handled?

Unfortunately, we don't have a reproducer for this crash.

Something is fishy.

If there is no skb in the queues, then tp->packets_out should be 0,
therefore tcp_rearm_rto() should simply call inet_csk_clear_xmit_timer(sk, ICSK_TIME_RETRANS);
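
The expectation described above can be written down as a small user-space model. This is only a sketch, not the kernel code: `struct model_sock`, `model_rearm_rto()` and the `retrans_timer_armed` flag are hypothetical stand-ins for `struct tcp_sock`, `tcp_rearm_rto()` and the ICSK_TIME_RETRANS timer, and everything except the packets_out test is left out.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for the socket state. */
struct model_sock {
    unsigned int packets_out;   /* segments believed to be in flight */
    bool retrans_timer_armed;   /* models the ICSK_TIME_RETRANS timer */
};

/*
 * Models only the decision discussed here: with nothing in flight the
 * retransmit timer is cleared, otherwise it is (re)armed.
 */
static void model_rearm_rto(struct model_sock *sk)
{
    if (sk->packets_out == 0)
        sk->retrans_timer_armed = false;  /* like inet_csk_clear_xmit_timer() */
    else
        sk->retrans_timer_armed = true;   /* timer restarted with the RTO */
}
```

Under this model a socket with empty queues and packets_out == 0 cannot keep the timer alive, so a timer that keeps firing would suggest packets_out is nonzero while the queues are empty.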

I have never seen this report before.

Do you use Fast Open? I am wondering if it's a bug when a TFO server
closes the socket before the handshake finishes...

Either way, it's pretty safe to just stop TLP if write queue is empty
for any unexpected reason.
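
To make the suggestion concrete, here is a toy user-space comparison of the two behaviours. It is a sketch under assumptions, not a patch: all names (`model_sock`, `probe_warn_and_rearm()`, `probe_stop_when_empty()`) are made up for illustration, and it assumes the suspected inconsistency is a stale nonzero packets_out while both queues are empty.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_skb { int len; };

/* Hypothetical, reduced socket state for illustration only. */
struct model_sock {
    struct model_skb *last_skb;  /* NULL models empty write and rtx queues */
    unsigned int packets_out;    /* assumed stale (nonzero) in the bad case */
    bool timer_armed;            /* models the pending loss-probe/RTO timer */
    unsigned int warnings;       /* counts WARN_ON-style splats */
};

/* Behaviour as reported: warn, then rearm based on packets_out. */
static void probe_warn_and_rearm(struct model_sock *sk)
{
    if (sk->last_skb == NULL) {
        sk->warnings++;                             /* WARN_ON(!skb) */
        sk->timer_armed = (sk->packets_out != 0);   /* goto rearm_timer */
        return;
    }
    sk->timer_armed = true;  /* a probe would be sent and the timer rearmed */
}

/* Suggested behaviour: nothing to probe with, so stop the timer. */
static void probe_stop_when_empty(struct model_sock *sk)
{
    if (sk->last_skb == NULL) {
        sk->timer_armed = false;
        return;
    }
    sk->timer_armed = true;
}
```

In this model, firing the timer repeatedly on a socket with `last_skb == NULL` and a stale `packets_out` produces one warning per expiry with the first function, while the second disarms the timer on the first expiry and stays quiet.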