Re: [PATCH v2 1/2] tcp/dcpp: Un-pin tw_timer
From: Eric Dumazet <edumazet@google.com>
Date: 2023-11-20 17:56:12
Also in:
linux-rt-users, lkml
On Wed, Nov 15, 2023 at 10:05 PM Valentin Schneider [off-list ref] wrote:
quoted hunk ↗ jump to hunk
The TCP timewait timer is proving to be problematic for setups where scheduler CPU isolation is achieved at runtime via cpusets (as opposed to statically via isolcpus=domains). What happens there is a CPU goes through tcp_time_wait(), arming the time_wait timer, then gets isolated. TCP_TIMEWAIT_LEN later, the timer fires, causing interference for the now-isolated CPU. This is conceptually similar to the issue described in e02b93124855 ("workqueue: Unbind kworkers before sending them to exit()") Keep softirqs disabled, but make the timer un-pinned and arm it *after* the hashdance. This introduces the following (non-fatal) race: CPU0 CPU1 allocates a tw insert it in hash table finds the TW and removes it (timer cancel does nothing) arms a TW timer, lasting This partially reverts ed2e92394589 ("tcp/dccp: fix timewait races in timer handling") and ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") This also reinstores a comment from ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") as inet_twsk_hashdance() had a "Step 1" and "Step 3" comment, but the "Step 2" had gone missing. Link: https://lore.kernel.org/all/ZPhpfMjSiHVjQkTk@localhost.localdomain/ (local) Signed-off-by: Valentin Schneider <vschneid@redhat.com> --- net/dccp/minisocks.c | 16 +++++++--------- net/ipv4/inet_timewait_sock.c | 20 +++++++++++++++----- net/ipv4/tcp_minisocks.c | 16 +++++++--------- 3 files changed, 29 insertions(+), 23 deletions(-)diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c index 64d805b27adde..2f0fad4255e36 100644 --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c@@ -53,16 +53,14 @@ void dccp_time_wait(struct sock *sk, int state, int timeo) if (state == DCCP_TIME_WAIT) timeo = DCCP_TIMEWAIT_LEN; - /* tw_timer is pinned, so we need to make sure BH are disabled - * in following section, otherwise timer handler could run before - * we complete the initialization. - */ - local_bh_disable(); - inet_twsk_schedule(tw, timeo); - /* Linkage updates. - * Note that access to tw after this point is illegal. - */ + local_bh_disable(); + + // Linkage updates inet_twsk_hashdance(tw, sk, &dccp_hashinfo); + inet_twsk_schedule(tw, timeo);
We could arm a timer there, while another thread/cpu found the TW in the ehash table.
+ // Access to tw after this point is illegal. + inet_twsk_put(tw);
This would eventually call inet_twsk_free() while the timer is armed. I think more work is needed. Perhaps make sure that a live timer owns a reference on tw->tw_refcnt (This is not the case atm)