[PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
From: Nam Cao <hidden>
Date: 2026-07-01 16:36:04
Also in: linux-rt-devel, lkml
Subsystem: networking [general], networking [unix sockets], the rest
Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima, Linus Torvalds

AF_UNIX sockets' sendmsg() schedules and blocks on the garbage collector
if the user has too many inflight unix sockets. This causes real-time
issues: high-priority processes that need to send many unix sockets get
blocked by the garbage collector, which runs in a workqueue, resulting
in priority inversion.

The reason for blocking on the garbage collector goes back to 2008, when
it was reported that "Local/unprivileged users can cause soft lockups
and take out system processes by triggering the OOM killer":

  https://bugzilla.redhat.com/show_bug.cgi?id=470201

The soft lockup occurred because a process could keep queueing AF_UNIX
sockets to another process that was exiting. Back in 2008, the garbage
collector was run synchronously by the exiting process, so the
continued queueing of AF_UNIX sockets blocked that process from
exiting. The solution to that issue was to force sendmsg() to wait for
the ongoing garbage collector run.

The OOM killer issue was brought up again in 2010:

  https://lore.kernel.org/lkml/AANLkTi=Q967xpX0KLMwX-=_4_1AKO5wjHEuJ1TrNjCj9@mail.gmail.com/

To resolve that report, besides blocking on the garbage collector,
sendmsg() was changed to also schedule the garbage collector if the
number of inflight AF_UNIX sockets in the system is too high.

Then in 2015, the OOM killer problem was brought up once again:

  https://lore.kernel.org/lkml/20151228141435.GA13351@1wt.eu/

That time, the issue was resolved by disallowing a user from having more
inflight AF_UNIX sockets than their RLIMIT_NOFILE. That was done by
commit 712f4aad406b ("unix: properly account for FDs passed over unix
sockets") and commit 415e3d3e90ce ("unix: correctly track in-flight fds
in sending process user_struct").

Now, sendmsg() no longer has to block on the garbage collector, because:

  - The OOM killer issue has already been addressed by checking
    RLIMIT_NOFILE.

  - The soft lockup issue is no longer relevant, because the garbage
    collector has run asynchronously since commit d9f21b361333
    ("af_unix: Try to run GC async.")

Therefore, remove the unix_schedule_gc() call from unix_prepare_fpl() to
prevent priority inversion.

No problems were observed when running all the reproducers from the bug
reports mentioned above with this patch applied.

Signed-off-by: Nam Cao <redacted>
---
 net/unix/garbage.c | 2 --
 1 file changed, 2 deletions(-)

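For readers less familiar with the term, the following user-space sketch
illustrates what an "inflight" AF_UNIX socket is: a socket whose file
descriptor has been queued with sendmsg() and SCM_RIGHTS but has not yet
been received. It is not part of the patch or of the reproducers; the
helper names send_fd(), recv_fd() and inflight_demo() are made up for
this illustration, and it is an assumption of this note that such a send
is what reaches unix_prepare_fpl() in the kernel.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: queue `fd` on the AF_UNIX socket `via` with SCM_RIGHTS.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int via, int fd)
{
	char data = 'x';	/* one byte of real data, as unix(7) recommends */
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union {			/* union keeps the buffer aligned for cmsghdr */
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);	/* the buffer is sized for exactly one header */
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(via, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd from `via`. Returns the fd or -1. */
static int recv_fd(int via)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(via, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Pass one AF_UNIX socket over another and check that the received fd
 * refers to the same socket. Returns 0 on success, -1 on failure. */
int inflight_demo(void)
{
	int chan[2] = { -1, -1 }, payload[2] = { -1, -1 };
	int got = -1, ret = -1;
	char sent = 'y', seen = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) < 0)
		goto cleanup;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, payload) < 0)
		goto cleanup;

	/* From here until recv_fd() below, payload[0] is inflight. */
	if (send_fd(chan[0], payload[0]) < 0)
		goto cleanup;
	got = recv_fd(chan[1]);
	if (got < 0)
		goto cleanup;

	/* Data written to the peer must be readable through the received fd. */
	if (write(payload[1], &sent, 1) == 1 &&
	    read(got, &seen, 1) == 1 && seen == sent)
		ret = 0;
cleanup:
	if (got >= 0) close(got);
	if (payload[0] >= 0) close(payload[0]);
	if (payload[1] >= 0) close(payload[1]);
	if (chan[0] >= 0) close(chan[0]);
	if (chan[1] >= 0) close(chan[1]);
	return ret;
}
```

Calling inflight_demo() from a small main() is enough to exercise the
send path once. A sender that queues descriptors without the peer ever
receiving them is the pattern behind the bug reports above; per unix(7),
such a sender eventually gets ETOOMANYREFS from sendmsg() once its
inflight descriptors exceed RLIMIT_NOFILE.
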
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 0783555e2526..f180c59b3da9 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
 	if (!fpl->edges)
 		goto err;
 
-	unix_schedule_gc(fpl->user);
-
 	return 0;
 
 err:
--
2.47.3