Re: [PATCH] tcp: check socket state before calling WARN_ON
From: Eric Dumazet <edumazet@google.com>
Date: 2025-02-25 18:43:33
Also in:
lkml
On Tue, Feb 25, 2025 at 7:28 PM Yuchung Cheng [off-list ref] wrote:
On Tue, Feb 25, 2025 at 9:25 AM Neal Cardwell [off-list ref] wrote:quoted
On Mon, Feb 24, 2025 at 4:13 PM Neal Cardwell [off-list ref] wrote:quoted
On Mon, Feb 3, 2025 at 12:17 AM Youngmin Nam [off-list ref] wrote:quoted
quoted
Hi Neal, Thank you for looking into this issue. When we first encountered this issue, we also suspected that tcp_write_queue_purge() was being called. We can provide any information you would like to inspect.Thanks again for raising this issue, and providing all that data! I've come up with a reproducer for this issue, and an explanation for why this has only been seen on Android so far, and a theory about a related socket leak issue, and a proposed fix for the WARN and the socket leak. Here is the scenario: + user process A has a socket in TCP_ESTABLISHED + user process A calls close(fd) + socket calls __tcp_close() and tcp_close_state() decides to enter TCP_FIN_WAIT1 and send a FIN + FIN is lost and retransmitted, making the state: --- tp->packets_out = 1 tp->sacked_out = 0 tp->lost_out = 1 tp->retrans_out = 1 --- + someone invokes "ss" to --kill the socket using the functionality in (1e64e298b8 "net: diag: Support destroying TCP sockets") https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1e64e298b8cad309091b95d8436a0255c84f54a (note: this was added for Android, so would not be surprising to have this inet_diag --kill run on Android) + the ss --kill causes a call to tcp_abort() + tcp_abort() calls tcp_write_queue_purge() + tcp_write_queue_purge() sets packets_out=0 but leaves lost_out=1, retrans_out=1 + tcp_sock still exists in TCP_FIN_WAIT1 but now with an inconsistent state + ACK arrives and causes a WARN_ON from tcp_verify_left_out(): #define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out) because the state has: --- tcp_left_out(tp) = sacked_out + lost_out = 1 tp->packets_out = 0 --- because the state is: --- tp->packets_out = 0 tp->sacked_out = 0 tp->lost_out = 1 tp->retrans_out = 1 --- I guess perhaps one fix would be to just have tcp_write_queue_purge() zero out those other fields: --- tp->sacked_out = 0 tp->lost_out = 0 tp->retrans_out = 0 --- However, there is a related and worse problem. Because this killed socket has tp->packets_out, the next time the RTO timer fires,Zeroing all inflights stats in tcp_write_queue_purge still makes sense to me. Why will the RTO timer still fire if packets_out is zeroed?
By definition, tcp_write_queue_purge() must only happen when the
socket reaches a final state.
No further transmit is possible, since this broke a major TCP
principle (stream mode, no sendmsg() can be zapped)
tcp_write_timer_handler() immediately returns if the final state is reached.
if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ..
return;
Also look at INET_CSK_CLEAR_TIMERS, if you want to know why the
retransmit timer can fire.