Re: [PATCH net-next] tcp: fix spurious connection aborts due to TCP_USER_TIMEOUT and zero window
From: Eric Dumazet <edumazet@google.com>
Date: 2026-05-27 13:40:25
Subsystem: networking [general], networking [tcp], the rest
Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Neal Cardwell, Linus Torvalds

On Wed, May 27, 2026 at 1:16 AM Zhi Cheng [off-list ref] wrote:
> Under certain conditions, a stale icsk_probes_tstamp can lead to an
> unexpected connection abort during a zero-window state.
>
> The exact sequence leading to the timeout is as follows:
>
> 1. A zero window occurs. icsk_probes_tstamp is set.
> 2. The window opens slightly. tcp_ack_probe() is called because there
>    are no in-flight packets. However, icsk_probes_tstamp is not cleared
>    because the window is not large enough to fit the tcp_send_head.
> 3. Packets are sent, and the RTO timer is armed, which overrides the
>    probe timer.
> 4. Subsequent ACKs consistently acknowledge packets and the window
>    never fully closes. Because there are now in-flight packets,
>    tcp_ack_probe() is never called again.
> 5. As a result, icsk_probes_tstamp is never updated despite the
>    connection no longer being in the zero-window state.
> 6. Much later, another zero window occurs. When the probe timer
>    triggers, tcp_probe_timer() evaluates the extremely old
>    icsk_probes_tstamp and immediately aborts the connection due to
>    TCP_USER_TIMEOUT.
>
> Fix this by explicitly clearing icsk_probes_tstamp in tcp_ack()
> whenever prior_packets is non-zero, ensuring that the probe timestamp
> is reset when exiting the zero-window state.
>
> Fixes: 9d9b1ee0b2d1 ("tcp: fix TCP_USER_TIMEOUT with zero window")
> Signed-off-by: Zhi Cheng <redacted>
> ---
>  net/ipv4/tcp_input.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index de9f68a9c0cf..02de64881b76 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4365,6 +4365,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>  	tp->rcv_tstamp = tcp_jiffies32;
>  	if (!prior_packets)
>  		goto no_queue;
> +	icsk->icsk_probes_tstamp = 0;
>
>  	/* See if we can take anything off of the retransmit queue. */
>  	flag |= tcp_clean_rtx_queue(sk, skb, prior_fack, prior_snd_una,

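For anyone following along, the six-step sequence in the quoted changelog can be replayed with a small userspace model. This is only a sketch under simplifying assumptions: the struct, the model_* helpers, the millisecond clock, and the 30 second timeout are all hypothetical and merely mirror the roles of icsk_probes_tstamp, tcp_ack_probe() and tcp_probe_timer() as described above. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two fields involved; not the kernel struct. */
struct conn_model {
	uint32_t probes_tstamp;   /* role of icsk_probes_tstamp, 0 = unset */
	uint32_t user_timeout_ms; /* role of TCP_USER_TIMEOUT, 0 = disabled */
};

/* Probe timer fires: returns true if the connection would be aborted. */
static bool model_probe_timer(struct conn_model *c, uint32_t now_ms)
{
	assert(now_ms != 0); /* 0 is reserved for "timestamp unset" */

	if (!c->probes_tstamp) {
		c->probes_tstamp = now_ms; /* first probe: start the clock */
		return false;
	}
	return c->user_timeout_ms &&
	       (int32_t)(now_ms - c->probes_tstamp) >=
	       (int32_t)c->user_timeout_ms;
}

/* ACK with nothing in flight: clear only if the send head now fits. */
static void model_ack_probe(struct conn_model *c, bool head_fits_window)
{
	if (head_fits_window)
		c->probes_tstamp = 0;
}

/* Replay steps 1-6; clear_on_ack models the proposed one-line change. */
static bool model_replay(bool clear_on_ack)
{
	struct conn_model c = { .probes_tstamp = 0, .user_timeout_ms = 30000 };

	model_probe_timer(&c, 1000); /* step 1: zero window, timestamp set */
	model_ack_probe(&c, false);  /* step 2: window opens, but too small */

	/* steps 3-5: data in flight, ACKs arrive with prior_packets != 0 */
	if (clear_on_ack)
		c.probes_tstamp = 0;

	/* step 6: much later, a new zero window and the probe timer fires */
	return model_probe_timer(&c, 600000);
}
```

In this model, model_replay(false) reports an abort on the very first probe of the second zero-window episode, while model_replay(true) restarts the clock instead.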
tcp_ack() is in the TCP fast path, and icsk_probes_tstamp has not been
touched in the fast path so far...

diff --git a/Documentation/networking/net_cachelines/inet_connection_sock.rst b/Documentation/networking/net_cachelines/inet_connection_sock.rst
index cc2000f55c29879a12c0e4d238242b01cee18091..dfb2ecb4c1621f2eac2f3183ed63057af90dba76 100644
--- a/Documentation/networking/net_cachelines/inet_connection_sock.rst
+++ b/Documentation/networking/net_cachelines/inet_connection_sock.rst
@@ -45,7 +45,7 @@ struct icsk_mtup_int                search_low             read_write
 struct icsk_mtup_u32:31             probe_size             read_write                              tcp_mtup_init,tcp_connect_init,__tcp_transmit_skb
 struct icsk_mtup_u32:1              enabled                read_write                              tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_probe,tcp_write_xmit
 struct icsk_mtup_u32                probe_timestamp        read_write                              tcp_mtup_init,tcp_connect_init,tcp_mtu_check_reprobe,tcp_mtu_probe
-u32                                 icsk_probes_tstamp
+u32                                 icsk_probes_tstamp                         read_write          tcp_ack
 u32                                 icsk_user_timeout
 u64[104/sizeof(u64)]                icsk_ca_priv
 =================================== ====================== =================== =================== ========================================================================================================================================================

An alternative would be to clear icsk_probes_tstamp in a less hot path.
Perhaps:

diff --git a/include/net/tcp.h b/include/net/tcp.h
index f063eccbbba340b39abc79b5541adca369d63d7c..751a407f64c2ba90e7b72d48942506870a994a2e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1631,6 +1631,13 @@ static inline void tcp_reset_xmit_timer(struct sock *sk,
 					unsigned long when,
 					bool pace_delay)
 {
+	if (what != ICSK_TIME_PROBE0) {
+		struct inet_connection_sock *icsk = inet_csk(sk);
+
+		if (icsk->icsk_pending == ICSK_TIME_PROBE0)
+			icsk->icsk_probes_tstamp = 0;
+	}
+
 	if (pace_delay)
 		when += tcp_pacing_delay(sk);
 	inet_csk_reset_xmit_timer(sk, what, when,
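
To make the intent of the tcp_reset_xmit_timer() idea easier to reason about, here is a rough userspace sketch of the same rule: when a timer other than the zero-window probe replaces a pending probe timer, the probe timestamp is dropped. The enum, the struct and model_reset_xmit_timer() are hypothetical stand-ins for ICSK_TIME_PROBE0, icsk_pending and icsk_probes_tstamp; the sketch ignores pacing and the actual timer arming.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ICSK_TIME_* timer kinds. */
enum model_timer {
	MODEL_TIMER_NONE,
	MODEL_TIMER_RETRANS,
	MODEL_TIMER_PROBE0,
};

/* Hypothetical model of the relevant connection state. */
struct timer_model {
	enum model_timer pending; /* role of icsk_pending */
	uint32_t probes_tstamp;   /* role of icsk_probes_tstamp, 0 = unset */
};

/* Re-arm the transmit timer, applying the rule from the diff above. */
static void model_reset_xmit_timer(struct timer_model *m,
				   enum model_timer what)
{
	/* Leaving the probe state: forget when probing started. */
	if (what != MODEL_TIMER_PROBE0 && m->pending == MODEL_TIMER_PROBE0)
		m->probes_tstamp = 0;

	m->pending = what;
}

/* Step 3 of the quoted scenario: RTO replaces a pending probe timer. */
static bool model_rto_clears_stale_tstamp(void)
{
	struct timer_model m = {
		.pending = MODEL_TIMER_PROBE0,
		.probes_tstamp = 1000,
	};

	model_reset_xmit_timer(&m, MODEL_TIMER_RETRANS);
	return m.probes_tstamp == 0 && m.pending == MODEL_TIMER_RETRANS;
}

/* Re-arming the probe timer itself must keep the running timestamp. */
static bool model_probe_rearm_keeps_tstamp(void)
{
	struct timer_model m = {
		.pending = MODEL_TIMER_PROBE0,
		.probes_tstamp = 1000,
	};

	model_reset_xmit_timer(&m, MODEL_TIMER_PROBE0);
	return m.probes_tstamp == 1000;
}
```

The point of placing the check here rather than in tcp_ack() is that the extra store only happens on the probe-to-other-timer transition, not on every ACK.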