Re: [RFC] Null Namespaces

From: Andy Lutomirski <luto@kernel.org>
Date: 2026-06-29 21:07:09
Also in: linux-arch, linux-fsdevel, lkml

On Mon, Jun 29, 2026 at 4:45 AM Christian Brauner [off-list ref] wrote:
But I guess the even simpler model would be to copy what I've been doing
for pidfs:

+static struct path nullfs_root_path = {};
+
+void nullfs_get_root(struct path *path)
+{
+       *path = nullfs_root_path;
+       path_get(path);
+}
+
 static void __init init_mount_tree(void)
 {
        struct vfsmount *mnt, *nullfs_mnt;
@@ -6209,6 +6217,8 @@ static void __init init_mount_tree(void)
        /* Mount mutable rootfs on top of nullfs. */
        root.mnt                = nullfs_mnt;
        root.dentry             = nullfs_mnt->mnt_root;
+       nullfs_root_path.mnt    = nullfs_mnt;
+       nullfs_root_path.dentry = nullfs_mnt->mnt_root;

        LOCK_MOUNT_EXACT(mp, &root);
        if (unlikely(IS_ERR(mp.parent)))
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index aadfbf6e0cb3..f55c87c70b78 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -124,6 +124,7 @@ struct delegation {

 #define FD_PIDFS_ROOT                  -10002 /* Root of the pidfs filesystem */
 #define FD_NSFS_ROOT                   -10003 /* Root of the nsfs filesystem */
+#define FD_NULLFS_ROOT                 -10004 /* Root of the nullfs filesystem */
 #define FD_INVALID                     -10009 /* Invalid file descriptor: -10000 - EBADF = -10009 */

 /* Generic flags for the *at(2) family of syscalls. */

we then add fchroot() (overdue anyway) and then teach both fchdir() and
fchroot() to honor FD_NULLFS_ROOT. Then a process may shed its fs state
and move itself into nullfs. Restrict *chdir() and *chroot() for said
process via seccomp and it's locked in forever as well.

One thing comes to mind that might need a bit of care: this would give
an API for any task to get an fd to a directory that lives in the init
mount namespace.  It's not at all obvious to me that this is dangerous
or even observable (you're not about to find a setuid program in
nullfs), but I think it's at least worth a tiny bit of consideration.

But if this happens, maybe we could finally land one of the patches to
enable unprivileged chroot?  It's been tried a few times.

https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.luto@amacapital.net/

https://lore.kernel.org/all/20210316203633.424794-2-mic@digikod.net/

I think the need for it has reduced a tiny bit with user namespaces,
as you can sort of emulate it by unsharing your user namespace and
thus getting enough privilege, but this is rather heavyweight and
limiting.
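
For illustration, the user-namespace emulation mentioned above might look
roughly like the sketch below. It is not taken from the thread: the helper
name try_userns_chroot() and its return codes are made up for this example,
and it assumes a system that permits unprivileged user namespaces (many
distributions and sandboxes restrict or disable them). It runs in a forked
child so the caller's own root is left alone.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): fork a child, move it into a
 * new user namespace, and try to chroot it into `dir`.
 *
 * Returns 0 if the child chrooted, 1 if unshare() failed, 2 if chroot()
 * failed, 3 if chdir("/") failed, and -1 if the child could not be
 * started or reaped.
 */
static int try_userns_chroot(const char *dir)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Inside a fresh user namespace the task holds a full capability
         * set for that namespace, including CAP_SYS_CHROOT. */
        if (unshare(CLONE_NEWUSER) != 0)
            _exit(1);
        if (chroot(dir) != 0)
            _exit(2);
        if (chdir("/") != 0)
            _exit(3);
        /* A real program would do its confined work here, and would
         * usually also write /proc/self/uid_map and gid_map. */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller would check the result of, say, try_userns_chroot("/var/empty")
and fall back to running unconfined (or refuse to run) when it is nonzero.
The extra namespace and the id-mapping setup a real program needs are the
"heavyweight and limiting" part.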


If all of the above landed, then the old chroot /var/empty kludge that
security-minded programs have done for decades could finally be
modernized and not require any privilege :)
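
For reference, the traditional kludge is typically some variant of the
sequence sketched below. The function name confine_to_empty_root() and its
parameters are illustrative, not from the thread; the calls themselves
(chroot, chdir, setgroups, setgid, setuid) are the standard ones, and the
whole thing only works when started with enough privilege, which is exactly
the requirement an unprivileged chroot into nullfs would remove.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative sketch of the classic privileged pattern: enter an empty
 * directory as the new root, then drop to an unprivileged uid/gid.
 * Returns 0 on success or a negative errno from the first failing step.
 */
static int confine_to_empty_root(const char *empty_dir, uid_t uid, gid_t gid)
{
    /* Needs CAP_SYS_CHROOT, so historically this runs as root. */
    if (chroot(empty_dir) != 0)
        return -errno;
    /* Move the cwd inside the new root so it cannot be used to escape. */
    if (chdir("/") != 0)
        return -errno;
    /* Drop supplementary groups first, then the gid, then the uid;
     * once the uid is dropped the earlier steps would no longer work. */
    if (setgroups(0, NULL) != 0)
        return -errno;
    if (setgid(gid) != 0)
        return -errno;
    if (setuid(uid) != 0)
        return -errno;
    return 0;
}
```

A daemon would call this once, early in startup, with the path of an empty
directory it does not own and the ids of a dedicated unprivileged account.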

Hmm, thinking aloud: every now and then someone brings up the idea of
having an fd (really an OFD) that points to a file or a directory but
carries less in the way of permissions/capabilities than the usual
OFDs.  If we had a way to make an OFD to a directory that forced
RESOLVE_BENEATH (or RESOLVE_IN_ROOT) and that propagated that
restriction to anything you open using it, and if an unprivileged
process could chroot itself to nullfs, then we would be getting quite
close to what Capsicum can do.
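
As a point of comparison, here is a minimal sketch of what exists today:
RESOLVE_BENEATH is a per-call flag to openat2(2), so the caller has to opt
in on every lookup, and nothing on the directory fd itself forces or
propagates the restriction. The wrapper name open_beneath() is made up for
this example; it goes through syscall(2) directly and assumes Linux 5.6 or
later with matching kernel headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open `path` relative to `dirfd`, refusing any
 * resolution that would leave the subtree under `dirfd`. `flags` must not
 * include O_CREAT or O_TMPFILE here, since how.mode is left at 0.
 * Returns the new fd, or a negative errno.
 */
static int open_beneath(int dirfd, const char *path, int flags)
{
    struct open_how how;

    memset(&how, 0, sizeof(how));
    how.flags = (unsigned int)flags;
    how.resolve = RESOLVE_BENEATH;

    long fd = syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
    if (fd < 0)
        return -errno;
    return (int)fd;
}
```

With this wrapper, an absolute path or a ".." that climbs above dirfd fails
(with EXDEV on kernels that support openat2), while paths that stay inside
the subtree open normally. The returned fd carries no such restriction, so
a later plain openat() on it can still escape; closing that gap is what the
restricted-OFD idea above would add.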

--Andy