Thread (4 messages), 3 authors, 2021-08-19

Re: [PATCHi, man-pages] mount_namespaces.7: More clearly explain "locked mounts"

From: Eric W. Biederman <hidden>
Date: 2021-08-16 16:03:49
Also in: linux-fsdevel, lkml

Michael Kerrisk [off-list ref] writes:
For a long time, this manual page has had a brief discussion of
"locked" mounts, without clearly saying what this concept is, or
why it exists. Expand the discussion with an explanation of what
locked mounts are, why mounts are locked, and some examples of the
effect of locking.

Thanks to Christian Brauner for a lot of help in understanding
these details.

Reported-by: Christian Brauner <redacted>
Signed-off-by: Michael Kerrisk <redacted>
---

Hello Eric and others,

After some quite helpful info from Christian Brauner, I've expanded
the discussion of locked mounts (a concept I didn't really have a
good grasp on) in the mount_namespaces(7) manual page. I would be
grateful to receive review comments, acks, etc., on the patch below.
Could you take a look please?

Cheers,

Michael

 man7/mount_namespaces.7 | 73 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
diff --git a/man7/mount_namespaces.7 b/man7/mount_namespaces.7
index e3468bdb7..97427c9ea 100644
--- a/man7/mount_namespaces.7
+++ b/man7/mount_namespaces.7
@@ -107,6 +107,62 @@ operation brings across all of the mounts from the original
 mount namespace as a single unit,
 and recursive mounts that propagate between
 mount namespaces propagate as a single unit.)
+.IP
+In this context, "may not be separated" means that the mounts
+are locked so that they may not be individually unmounted.
+Consider the following example:
+.IP
+.RS
+.in +4n
+.EX
+$ \fBsudo mkdir /mnt/dir\fP
+$ \fBsudo sh \-c \(aqecho "aaaaaa" > /mnt/dir/a\(aq\fP
+$ \fBsudo mount \-\-bind \-o ro /some/path /mnt/dir\fP
+$ \fBls /mnt/dir\fP   # Former contents of directory are invisible
Do we want a more motivating example, such as /proc/sys?

It has been common to mount over /proc files and directories that can be
written to by the global root so that users in a mount namespace may not
touch them.
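
As a rough sketch of what such an example might look like (this is only an illustration, not text proposed for the page): the helper names run and mask_ro and the DRY_RUN switch are made up here, and the mount invocation simply mirrors the "mount --bind -o ro" form already used in the patch, assuming a util-linux recent enough to honor "ro" on a bind mount. Because the commands need root and change the mount table, the sketch only prints them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: cover a /proc path with a read-only bind mount of itself, so
# that a less privileged mount namespace inherits it as a locked mount.
# DRY_RUN (hypothetical switch) defaults to 1: commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# mask_ro PATH (hypothetical helper): bind-mount PATH over itself, read-only.
mask_ro() {
    run mount --bind -o ro "$1" "$1"
}

mask_ro /proc/sys
```

With DRY_RUN=0 and root privileges, the unshare/umount step from the patch could then be pointed at /proc/sys instead of /mnt/dir; I would expect it to fail in the same way, but I have not recorded the output here.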

+.EE
+.in
+.RE
+.IP
+The above steps, performed in a more privileged user namespace,
+have created a (read-only) bind mount that
+obscures the contents of the directory
+.IR /mnt/dir .
+For security reasons, it should not be possible to unmount
+that mount in a less privileged user namespace,
+since that would reveal the contents of the directory
+.IR /mnt/dir .
+.IP
+Suppose we now create a new mount namespace
+owned by a (new) subordinate user namespace.
+The new mount namespace will inherit copies of all of the mounts
+from the previous mount namespace.
+However, those mounts will be locked because the new mount namespace
+is owned by a less privileged user namespace.
+Consequently, an attempt to unmount the mount fails:
+.IP
+.RS
+.in +4n
+.EX
+$ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP
+               \fBstrace \-o /tmp/log \e\fP
+               \fBumount /mnt/dir\fP
+umount: /mnt/dir: not mounted.
+$ \fBgrep \(aq^umount\(aq /tmp/log\fP
+umount2("/mnt/dir", 0)     = \-1 EINVAL (Invalid argument)
+.EE
+.in
+.RE
+.IP
+The error message from
+.BR umount (8)
+is a little confusing, but the
+.BR strace (1)
+output reveals that the underlying
+.BR umount2 (2)
+system call failed with the error
+.BR EINVAL ,
+which is the error that the kernel returns to indicate that
+the mount is locked.
Do you want to mention that you can unmount the entire subtree?  Either
with pivot_root if it is locked to "/" or with
"umount -l /path/to/propagated/directory".
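
A minimal sketch of the second variant, again only as an illustration: detach_subtree, run and DRY_RUN are made-up names, and the path is the same placeholder as above, not a real location. The idea is that a lazy unmount of the top of the propagated tree, issued inside the less privileged mount namespace, detaches that tree as a unit rather than peeling off one locked mount. The sketch prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: detach a whole propagated subtree with a lazy unmount instead
# of trying to unmount a single locked mount inside it.
# DRY_RUN (hypothetical switch) defaults to 1: commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# detach_subtree DIR (hypothetical helper): lazily unmount DIR together
# with everything mounted beneath it.
detach_subtree() {
    run umount -l "$1"
}

# Placeholder path taken from the comment above; substitute the real
# top of the propagated tree.
detach_subtree /path/to/propagated/directory
```

The pivot_root case is left out of the sketch, since it needs a prepared new root and is harder to show in a few lines.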
 .IP *
 The
 .BR mount (2)
@@ -128,6 +184,23 @@ settings become locked
 when propagated from a more privileged to
 a less privileged mount namespace,
 and may not be changed in the less privileged mount namespace.
+.IP
+This point can be illustrated by a continuation of the previous example.
+In that example, the bind mount was marked as read-only.
+For security reasons,
+it should not be possible to make the mount writable in
+a less privileged namespace, and indeed the kernel prevents this,
+as illustrated by the following:
+.IP
+.RS
+.in +4n
+.EX
+$ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP
+               \fBmount \-o remount,rw /mnt/dir\fP
+mount: /mnt/dir: permission denied.
+.EE
+.in
+.RE
 .IP *
 .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree))
 A file or directory that is a mount point in one namespace that is not
Eric