Thread: 5 messages, 2 authors, 2025-07-15

Re: [PATCH net-next v4] tcp: extend tcp_retransmit_skb tracepoint with failure reasons

From: Jakub Kicinski <kuba@kernel.org>
Date: 2025-07-14 23:46:26
Also in: linux-trace-kernel, lkml

On Thu, 10 Jul 2025 10:01:38 +0800 (CST) fan.yu9@zte.com.cn wrote:
> Background
> ==========
> When TCP retransmits a packet due to missing ACKs, the
> retransmission may fail for various reasons (e.g., packets
> stuck in driver queues, sequence errors, or routing issues).
>
> The original tcp_retransmit_skb tracepoint:
> 'commit e086101b150a ("tcp: add a tracepoint for tcp retransmission")'
> lacks visibility into these failure causes, making production
> diagnostics difficult.
>
> Solution
> ========
> Adds a "result" field to the tcp_retransmit_skb tracepoint,
> enumerating with explicit failure cases:
> TCP_RETRANS_ERR_DEFAULT (retransmit terminate unexpectedly)
> TCP_RETRANS_IN_HOST_QUEUE (packet still queued in driver)
> TCP_RETRANS_END_SEQ_ERROR (invalid end sequence)
> TCP_RETRANS_NOMEM (retransmit no memory)
> TCP_RETRANS_ROUTE_FAIL (routing failure)
> TCP_RETRANS_RCV_ZERO_WINDOW (closed receiver window)

Have you tried to use this or perform some analysis of which of these
reasons actually make sense to add? I'd venture a guess that
IN_HOST_QUEUE will dominate in datacenter. Maybe RCV_ZERO_WINDOW
can happen. Tracing ENOMEM is a waste of time, so is this:

 		if (unlikely(before(TCP_SKB_CB(skb)->end_seq, tp->snd_una))) {
            >>>>>	WARN_ON_ONCE(1);  <<<<<<<<
-			return -EINVAL;
+			result = TCP_RETRANS_END_SEQ_ERROR;
-- 
pw-bot: cr
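
Editorial aside on the hunk quoted above: the condition uses before(), the kernel's wraparound-safe comparison of 32-bit TCP sequence numbers, to detect a segment whose end sequence lies strictly behind snd_una (the oldest unacknowledged sequence number). That path already triggers WARN_ON_ONCE(1), which is why an extra trace reason for it looks redundant. The following is a minimal userspace sketch of that comparison, not kernel code; the names seq_before(), end_seq_behind_una() and seq_check_demo() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "seq1 is earlier than seq2" on 32-bit sequence numbers.
 * Uses the signed-difference idiom that the kernel's before() helper is
 * built on; this userspace copy and its name are illustrative only. */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Hypothetical helper: true when the segment's end sequence lies strictly
 * before snd_una, the situation the quoted hunk treats as an error. */
static bool end_seq_behind_una(uint32_t end_seq, uint32_t snd_una)
{
	return seq_before(end_seq, snd_una);
}

/* Small self-check covering the ordinary and the wrapped case. */
void seq_check_demo(void)
{
	assert(end_seq_behind_una(1000, 2000));   /* plainly behind */
	assert(!end_seq_behind_una(2000, 1000));  /* plainly ahead */
	assert(!end_seq_behind_una(1000, 1000));  /* equal is not "before" */
	/* snd_una has wrapped past zero; end_seq is still behind it. */
	assert(end_seq_behind_una(0xfffffff0u, 0x10u));
}
```

A plain `end_seq < snd_una` would give the wrong answer in the last case, which is why the signed difference is used.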