
[PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()

From: Nam Cao <hidden>
Date: 2026-07-01 16:36:04
Also in: linux-rt-devel, lkml
Subsystem: networking [general], networking [unix sockets], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima, Linus Torvalds

sendmsg() on an AF_UNIX socket schedules the garbage collector and blocks
on it if the user has too many inflight unix sockets. This causes real-time
issues: high-priority processes that need to send many unix sockets get
blocked by the garbage collector, which runs in a workqueue, resulting in
priority inversion.

The reason for blocking on garbage collector goes back to 2008, when
it was reported that "Local/unprivileged users can cause soft lockups
and take out system processes by triggering the OOM killer":
https://bugzilla.redhat.com/show_bug.cgi?id=470201

The soft lockup happened because a process could keep queueing AF_UNIX
sockets to another process that was exiting. Back in 2008, the garbage
collector was run synchronously by the exiting process, so the continued
queueing of AF_UNIX sockets kept that process from exiting.

The solution to that issue was to force sendmsg() to wait for any ongoing
garbage collection.

The OOM killer issue was brought up again in 2010:
https://lore.kernel.org/lkml/AANLkTi=Q967xpX0KLMwX-=_4_1AKO5wjHEuJ1TrNjCj9@mail.gmail.com/

To resolve that report, sendmsg() was changed to also schedule the garbage
collector, in addition to blocking on it, if the number of inflight AF_UNIX
sockets in the system is too high.

Then in 2015, once again, the OOM killer problem was brought up:
https://lore.kernel.org/lkml/20151228141435.GA13351@1wt.eu/

That time, the issue was resolved by disallowing a user from having more
inflight AF_UNIX sockets than their RLIMIT_NOFILE. That was done by commit
712f4aad406b ("unix: properly account for FDs passed over unix sockets")
and commit 415e3d3e90ce ("unix: correctly track in-flight fds in sending
process user_struct").
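
As a rough illustration of the limit involved, the sketch below reads the
calling process's soft RLIMIT_NOFILE from userspace. The helper name
nofile_soft_limit() is hypothetical. Per unix(7), a sendmsg() that would
push the sender past this limit on inflight file descriptors can fail with
ETOOMANYREFS unless the caller has CAP_SYS_RESOURCE; the kernel-side check
itself is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: store the soft RLIMIT_NOFILE of the calling
 * process in *out. Returns 0 on success, -1 if getrlimit() fails.
 */
static int nofile_soft_limit(rlim_t *out)
{
	struct rlimit rl;

	assert(out != NULL);
	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;

	*out = rl.rlim_cur;	/* soft limit; rl.rlim_max is the hard limit */
	return 0;
}
```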

Now, sendmsg() does not have to block on the garbage collector anymore,
because:

  - The OOM killer issue has already been addressed by checking
    RLIMIT_NOFILE.

  - The soft lockup issue is no longer relevant, because the garbage
    collector now runs asynchronously since commit d9f21b361333 ("af_unix:
    Try to run GC async.")

Therefore, remove the unix_schedule_gc() call from unix_prepare_fpl() to
prevent priority inversion. All the reproducers from the bug reports
mentioned above were run with this patch applied, and no problems were
observed.

Signed-off-by: Nam Cao <redacted>
---
 net/unix/garbage.c | 2 --
 1 file changed, 2 deletions(-)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 0783555e2526..f180c59b3da9 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
 	if (!fpl->edges)
 		goto err;
 
-	unix_schedule_gc(fpl->user);
-
 	return 0;
 
 err:
-- 
2.47.3