Re: [PATCH RFC DRAFT 00/50] nstree: listns()
From: Christian Brauner <brauner@kernel.org>
Date: 2025-10-24 14:54:25
Also in: bpf, cgroups, linux-fsdevel, lkml
So that punches a hole in the active reference count tracking. So this will have to be handled as right now socket file descriptors that pin a network namespace that doesn't have an active reference anymore (no live processes, no explicit persistence via namespace fds) can't be used to issue a SIOCGSKNS ioctl() to open the associated network namespace.
Is this capability something we need to preserve? It seems like the fact that SIOCGSKNS works when there are no active references left might have been an accident. Is there a legit use-case for allowing that?
I've solved that use-case now and have added a large testsuite to verify that it works.
I don't see a problem with active+passive refcounts. They're more complicated to deal with, but we've used them elsewhere so it's a pattern we all know (even if we don't necessarily love them).
+1
I'll also point out that net namespaces already have two refcounts for this exact reason. Do you plan to replace the passive refcount in struct net with the new passive refcount you're implementing here?
Yeah, that's an option. I think that in the future it should also be possible to completely drop the net/ internal network namespace tracking and rely on the nstree infrastructure only. But that's work for the future.
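For readers less familiar with the active+passive pattern discussed above, here is a minimal userspace sketch using C11 atomics. It is not the nstree or struct net code: struct toy_ns and every toy_ns_*() helper are hypothetical names made up for illustration. The assumption it encodes is that all active references together hold one passive reference, so dropping the last active reference tears the object down logically while passive holders (a socket, for example) keep the memory valid.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object for illustration; not a kernel structure. */
struct toy_ns {
	atomic_int active;  /* users that keep the object usable */
	atomic_int passive; /* users that only keep the memory around */
	bool torn_down;     /* set once the last active reference is gone */
};

static struct toy_ns *toy_ns_alloc(void)
{
	struct toy_ns *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	atomic_init(&ns->active, 1);
	/* All active references together pin exactly one passive reference. */
	atomic_init(&ns->passive, 1);
	return ns;
}

static void toy_ns_get_passive(struct toy_ns *ns)
{
	atomic_fetch_add(&ns->passive, 1);
}

/* Returns true if this call dropped the last passive reference and freed ns. */
static bool toy_ns_put_passive(struct toy_ns *ns)
{
	if (atomic_fetch_sub(&ns->passive, 1) == 1) {
		free(ns);
		return true;
	}
	return false;
}

/* Take an active reference only if the object is still active. */
static bool toy_ns_tryget_active(struct toy_ns *ns)
{
	int old = atomic_load(&ns->active);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ns->active, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns true if the memory was freed as a result of this put. */
static bool toy_ns_put_active(struct toy_ns *ns)
{
	if (atomic_fetch_sub(&ns->active, 1) == 1) {
		ns->torn_down = true;
		/* Drop the passive reference held on behalf of active users. */
		return toy_ns_put_passive(ns);
	}
	return false;
}
```

In this model, a holder that only called toy_ns_get_passive() can still dereference the object after teardown, but toy_ns_tryget_active() fails for it. That is the situation the SIOCGSKNS question is about.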
So two options I see if the api is based on ids: (1) We use the active reference count and somehow also make it work with sockets. (2) The active reference count is not needed and we say that listns() is an introspection system call anyway so we just always list namespaces regardless of why they are still pinned: files, mm_struct, network devices, everything is fair game. (3) Throw hands up in the air and just not do it.
Is listns() the only reason we'd need active/passive refcounts? It seems like we might need them for other reasons (e.g. struct net).
Yes.
IMO, even if you keep the active+passive refcounts, it would be good to be able to tell listns() to return all the namespaces, and not just the ones that are still active. Maybe that can be the first flag for this new syscall?
Certainly possible, but that would be pure introspection. As I said elsewhere, I have implemented the nstree infrastructure in a way that will allow bpf to walk the namespace trees, and that would obviously also include all namespaces that are not active anymore.
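To make the "list everything" suggestion above concrete, here is a toy userspace model of a listing function with an opt-in flag. Nothing here is the proposed listns() interface: TOY_LIST_INCLUDE_INACTIVE, struct toy_ns_entry and toy_list() are hypothetical names, and a flat array stands in for whatever the real tree walk would look like.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: also report entries with no active references. */
#define TOY_LIST_INCLUDE_INACTIVE 0x1u

/* Hypothetical table entry; a real implementation would walk a tree. */
struct toy_ns_entry {
	uint64_t id;
	unsigned int active; /* 0 means the entry is only passively pinned */
};

/*
 * Copy the ids of matching entries into out[] (at most cap of them)
 * and return how many were written. Inactive entries are skipped
 * unless TOY_LIST_INCLUDE_INACTIVE is set in flags.
 */
static size_t toy_list(const struct toy_ns_entry *tbl, size_t n,
		       unsigned int flags, uint64_t *out, size_t cap)
{
	size_t written = 0;

	for (size_t i = 0; i < n && written < cap; i++) {
		if (tbl[i].active == 0 && !(flags & TOY_LIST_INCLUDE_INACTIVE))
			continue;
		out[written++] = tbl[i].id;
	}
	return written;
}
```

The default (flags == 0) matches the "active only" behaviour discussed in the thread. The flag turns the call into the pure introspection variant.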