Thread (22 messages) 22 messages, 4 authors, 2023-05-24

Re: [PATCH net-next 3/3] net: tcp: handle window shrink properly

From: Menglong Dong <hidden>
Date: 2023-05-24 12:16:23
Also in: lkml

On Tue, May 23, 2023 at 9:27 PM Neal Cardwell [off-list ref] wrote:
quoted
Oh, I understand what you mean now. You are saying that
retransmit that first packet in the retransmit queue instead
of zero-window probe packet when OOM of the receiver,
isn't it? In other word, retransmit the unacked data and ignore
the tcp_retries2 when we find the receiver is in OOM state.
Yes. The idea would be to use a heuristic to estimate the receiver is
currently OOM and use ICSK_TIME_PROBE0 / tcp_probe_timer() /
tcp_write_wakeup() in this case instead of ICSK_TIME_RETRANS /
tcp_retransmit_timer().
Well, I think that maybe we should use ICSK_TIME_PROBE0 /
tcp_probe_timer() / tcp_retransmit_skb(), isn't it?

What tcp_write_wakeup() does is send new data if the receive
window available, which means push new data into the retransmit
queue. However, what we need to do now is retransmit the first
packet in the rtx queue, isn't it?

In the tcp_ack(), we estimate that if the receiver is OOM and
mark the sk with OOM state, and raise ICSK_TIME_PROBE0.
When new data is acked, we leave the OOM state.

The OOM state can only happen when the rtx queue is not empty,
otherwise the tcp connection will enter normal zero-window probe
state. So when the timeout of ICSK_TIME_PROBE0, we need
retransmit the skb in the rtx queue.

tcp_write_wakeup() don't do the job the retransmit packet, but
send new data.

Am I right?

Thanks!
Menglong Dong
quoted
That's an option, and we can make the length of the data we
send to 1 byte, which means we keep retransmitting the first
byte that has not be acked in the retransmit queue.
I don't think it would be worth adding special-case code to only send
1 byte.  If the data receiver is not OOM then for CPU and memory
efficiency reasons (as well as simplicity) the data sender should send
it a full MSS. So for those reasons I would suggest that in this
potential approach tcp_write_wakeup() should stay the same.

neal
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help