Re: [PATCH] tcp: Do not reset the icsk_ca_initialized in tcp_init_transfer.
From: Nguyen Dinh Phi <hidden>
Date: 2021-06-29 12:28:29
Also in: bpf, linux-kernel-mentees, lkml

On June 29, 2021 4:21:59 PM GMT+08:00, Eric Dumazet [off-list ref] wrote:
> On Tue, Jun 29, 2021 at 9:17 AM Nguyen Dinh Phi [off-list ref] wrote:
>> On June 29, 2021 1:20:19 AM GMT+08:00, Neal Cardwell [off-list ref] wrote:
>>> On Mon, Jun 28, 2021 at 1:15 PM Phi Nguyen [off-list ref] wrote:
>>>> On 6/29/2021 12:24 AM, Neal Cardwell wrote:
>>>>> Thanks.
>>>>>
>>>>> Can you also please provide a summary of the event sequence that
>>>>> triggers the bug? Based on your Reported-by tag, I guess this is
>>>>> based on the syzbot reproducer:
>>>>>
>>>>> https://groups.google.com/g/syzkaller-bugs/c/VbHoSsBz0hk/m/cOxOoTgPCAAJ
>>>>>
>>>>> but perhaps you can give a summary of the event sequence that causes
>>>>> the bug? Is it that the call:
>>>>>
>>>>> setsockopt$inet_tcp_TCP_CONGESTION(r0, 0x6, 0xd, &(0x7f0000000000)='cdg\x00', 0x4)
>>>>>
>>>>> initializes the CC and happens before the connection is established,
>>>>> and then when the connection is established, the line that sets:
>>>>>
>>>>> icsk->icsk_ca_initialized = 0;
>>>>>
>>>>> is incorrect, causing the CC to be initialized again without first
>>>>> calling the cleanup code that deallocates the CDG-allocated memory?
>>>>>
>>>>> thanks,
>>>>> neal
>>>>
>>>> Hi Neal,
>>>>
>>>> The gdb stack trace that led to init_transfer_input() is as below;
>>>> the current sock state is TCP_SYN_RECV.
>>>
>>> Thanks. That makes sense as a snapshot of time for tcp_init_transfer(),
>>> but I think what would be more useful would be a description of the
>>> sequence of events, including when the CC was initialized previous to
>>> that point (as noted above, was it that the setsockopt(TCP_CONGESTION)
>>> completed before that point?).
>>>
>>> thanks,
>>> neal
>>
>> I am resending my message because I accidentally used HTML format in the
>> last one. I am very sorry for the inconvenience caused.
>>
>> ---
>> Yes, the CC had been initialized by the setsockopt; after that, it was
>> initialized again in function tcp_init_transfer() because of setting
>> icsk_ca_initialized to 0.
>
> "the setsockopt" is rather vague, sorry.
>
> The hard part is that all scenarios have to be considered.
>
> TCP flows can be either passive or active.
>
> CC can be set:
>
> 1) Before the connect() or accept()
> 2) After the connect() or accept()
> 3) After the connect() but before 3WHS is completed.
>
> So we need to make sure all cases will still work with any combination
> of CDG CC (before/after) in the picture.
>
> Note that a memory leak for a restricted CC (CDG can only be used by a
> CAP_NET_ADMIN privileged user) is a small problem compared to a more
> serious bug that could be added by an incomplete fix.
>
> I also note that if icsk_ca_priv[] was increased from 104 to 120 bytes,
> tcp_cdg would no longer need a dynamic memory allocation.
>
> Thank you.
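
The cases listed above differ only in where the setsockopt() call sits relative to connect() or accept(). A minimal userspace sketch of case 1 (CC chosen on a fresh socket, before any connect()) might look like the following. The function name set_cc_before_connect() is made up for illustration, and "reno" is used in the usage note instead of "cdg" on the assumption that CDG may not be loaded or permitted for the caller.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Illustrative helper (not from the thread): select a congestion control
 * on a new, unconnected TCP socket and read the name back.
 * Returns 0 on success, -1 on any failure.
 */
int set_cc_before_connect(const char *name, char *out, socklen_t outlen)
{
    socklen_t len = outlen;
    int fd;

    if (outlen == 0)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Same option as the reproducer's (0x6, 0xd): IPPROTO_TCP, TCP_CONGESTION. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name,
                   (socklen_t)strlen(name)) < 0) {
        close(fd);
        return -1;
    }

    /* Read back what the kernel stored for this socket. */
    memset(out, 0, outlen);
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, out, &len) < 0) {
        close(fd);
        return -1;
    }
    out[outlen - 1] = '\0'; /* make sure the result is NUL-terminated */

    close(fd);
    return 0;
}
```

Calling set_cc_before_connect("reno", buf, sizeof(buf)) with a 16-byte buf exercises case 1 only. Cases 2 and 3 would move the setsockopt() call after a blocking connect(), or between a non-blocking connect() and its completion.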

Hi,

I will try to see whether I am able to get the full sequence. I am also
afraid of making a change that could affect a big part of the kernel.

About CDG, how can we get rid of the dynamic allocation by increasing
icsk_ca_priv to 120 bytes? I see that the window size is a module
parameter, so I guess it is not a fixed value.

Since the problem only happens with CDG, would adding a check in its
tcp_cdg_init() function be OK?

And about icsk_ca_initialized, can I expect it to be 0 in the CC's init
functions?

Thank you.
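
To make the double-init sequence and the proposed check concrete, here is a small userspace model. It is not kernel code: model_sock, model_cdg, and all model_* functions are hypothetical stand-ins, and calloc()/free() stand in for the kernel allocator. It also assumes the CC private area is zeroed before the first init, which this thread has not established for the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct model_cdg {
    int *gradients;          /* models the array CDG allocates in its init */
};

struct model_sock {
    bool ca_initialized;     /* models icsk->icsk_ca_initialized */
    struct model_cdg ca;     /* models the CC private area */
};

static int live_allocs;      /* allocations not yet freed */

static void model_cdg_release(struct model_sock *sk)
{
    if (sk->ca.gradients) {
        free(sk->ca.gradients);
        sk->ca.gradients = NULL;
        live_allocs--;
        assert(live_allocs >= 0); /* never free more than was allocated */
    }
}

/* Init without a check: a second call overwrites the old pointer. */
static void model_cdg_init_unguarded(struct model_sock *sk, size_t window)
{
    sk->ca.gradients = calloc(window, sizeof(*sk->ca.gradients));
    if (sk->ca.gradients)
        live_allocs++;
    sk->ca_initialized = true;
}

/* Init with a check: drop any earlier allocation before allocating again. */
static void model_cdg_init_guarded(struct model_sock *sk, size_t window)
{
    model_cdg_release(sk);
    model_cdg_init_unguarded(sk, window);
}

/*
 * Replays the reported sequence: init via setsockopt, flag reset to 0,
 * second init, then teardown. Returns how many allocations were leaked.
 */
int model_replay(bool guarded, size_t window)
{
    struct model_sock sk = { 0 };
    int *first;
    int leaked;

    live_allocs = 0;
    model_cdg_init_unguarded(&sk, window);   /* setsockopt(TCP_CONGESTION) */
    first = sk.ca.gradients;

    sk.ca_initialized = false;               /* the reset under discussion */
    if (guarded)
        model_cdg_init_guarded(&sk, window);
    else
        model_cdg_init_unguarded(&sk, window);

    model_cdg_release(&sk);                  /* socket teardown */
    leaked = live_allocs;

    if (!guarded)
        free(first);                         /* tidy up the model's own leak */
    return leaked;
}
```

With a window of 8, model_replay(false, 8) reports one leaked allocation and model_replay(true, 8) reports none. The model covers only the sequence confirmed in this thread; it says nothing about the other orderings of setsockopt() and connect()/accept() raised above.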