Thread: 13 messages, 4 authors, 2026-03-03

Re: [RFC PATCH net-next] tcp: Add net.ipv4.tcp_purge_receive_queue sysctl

From: Leon Hwang <hidden>
Date: 2026-02-25 09:48:25
Also in: linux-doc, lkml


On 25/2/26 16:31, Eric Dumazet wrote:
> On Wed, Feb 25, 2026 at 8:46 AM Leon Hwang [off-list ref] wrote:
>> Introduce a new sysctl knob, net.ipv4.tcp_purge_receive_queue, to
>> address a memory leak scenario related to TCP sockets.
>
> We use the term "memory leak" for a persistent loss of memory (until reboot)

Thanks for the clarification.

> Let's not abuse it and confuse various AI/human agents which will
> declare emergency situations caused by an inexistent fatal error.

I'll reword it in the next revision.

>> Issue:
>> When a TCP socket in the CLOSE_WAIT state receives a RST packet, the
>> current implementation does not clear the socket's receive queue. This
>> causes SKBs in the queue to remain allocated until the socket is
>> explicitly closed by the application. As a consequence:
>>
>> 1. The page pool pages held by these SKBs are not released.
>
> This situation also applies for normal TCP_ESTABLISHED sockets, when
> applications do not drain the receive queue.
>
> As long as the application has not called close(), kernel should not
> assume the application will _not_ read the data that was received.

Understood.

This patch provides an option to purge the receive queue in the
CLOSE_WAIT + RST case; it does not purge the queue unconditionally upon
receiving a RST packet.

>> 2. The associated page pool cannot be freed.
>>
>> RFC 9293 Section 3.10.7.4 specifies that when a RST is received in
>> CLOSE_WAIT state, "all segment queues should be flushed." However, the
>> current implementation does not flush the receive queue.
>
> Some buggy stacks send RST anyway after FIN. I think that forcibly
> purging good data received before the RST would add many surprises.

Understood.

There is a tcp_write_queue_purge(sk) call in tcp_done_with_error(),
which means sk_write_queue is always purged when a RST packet is
received. I assume the reason for purging sk_write_queue is that any
pending transmissions become meaningless once a RST is received.

Would it be better to defer skb_queue_purge(&sk->sk_receive_queue) until
after tcp_done_with_error()?
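
As an illustration of the ordering asked about here, a toy user-space model might look like the sketch below. It is not kernel code and not taken from the patch: every toy_* name is hypothetical, the purge_receive_queue flag merely stands in for the proposed sysctl, and the error value is a placeholder. It only models "purge the write queue unconditionally, then optionally purge the receive queue when the socket was in CLOSE_WAIT".

```c
/* Toy user-space model of the ordering only. NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_skb {
	struct toy_skb *next;
};

struct toy_queue {
	struct toy_skb *head;
	unsigned int qlen;
};

struct toy_sock {
	struct toy_queue write_queue;   /* models sk_write_queue */
	struct toy_queue receive_queue; /* models sk_receive_queue */
	bool close_wait;                /* socket was in CLOSE_WAIT when the RST arrived */
	bool purge_receive_queue;       /* hypothetical stand-in for the proposed sysctl */
	int err;
};

/* Queue one dummy buffer. */
static void toy_queue_add(struct toy_queue *q)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	assert(skb != NULL);
	skb->next = q->head;
	q->head = skb;
	q->qlen++;
}

/* Free every buffer in the queue. */
static void toy_queue_purge(struct toy_queue *q)
{
	while (q->head) {
		struct toy_skb *skb = q->head;

		q->head = skb->next;
		free(skb);
	}
	q->qlen = 0;
}

/* Models only the behaviour described above: the write queue is always purged. */
static void toy_done_with_error(struct toy_sock *sk, int err)
{
	toy_queue_purge(&sk->write_queue);
	sk->err = err;
}

/* RST handling in the model: the receive-queue purge is deferred and optional. */
static void toy_reset(struct toy_sock *sk, int err)
{
	toy_done_with_error(sk, err);
	if (sk->close_wait && sk->purge_receive_queue)
		toy_queue_purge(&sk->receive_queue);
}

/* Returns how many received buffers survive a reset in CLOSE_WAIT. */
static unsigned int toy_demo(bool purge_receive_queue)
{
	struct toy_sock sk = {
		.close_wait = true,
		.purge_receive_queue = purge_receive_queue,
	};
	unsigned int left;

	toy_queue_add(&sk.write_queue);
	toy_queue_add(&sk.receive_queue);
	toy_queue_add(&sk.receive_queue);

	toy_reset(&sk, 1 /* placeholder error code */);
	assert(sk.write_queue.qlen == 0);

	left = sk.receive_queue.qlen;
	toy_queue_purge(&sk.receive_queue); /* cleanup, loosely models close() */
	return left;
}
```

In this model, toy_demo(false) leaves both received buffers available for the application to read, while toy_demo(true) leaves none.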

[...]
> Please prepare a packetdrill test.

Ack.

I'll add a packetdrill test in the next revision.

Thanks,
Leon