Thread (23 messages), 7 authors, 2024-05-08

Re: [PATCH v5 0/3] implement OA2_CRED_INHERIT flag for openat2()

From: Christian Brauner <brauner@kernel.org>
Date: 2024-05-07 07:42:49
Also in: linux-fsdevel, lkml

With my kernel hat on, maybe I agree.  But with my *user* hat on, I
think I pretty strongly disagree.  Look, idmap is lousy for
unprivileged use:

$ install -m 0700 -d test_directory
$ echo 'hi there' >test_directory/file
$ podman run -it --rm \
    --mount=type=bind,src=test_directory,dst=/tmp,idmap [debian-slim]

$ podman run -it --rm --mount=type=bind,src=test_directory,dst=/tmp,idmap [debian-slim]

as an unprivileged user doesn't use idmapped mounts at all. So I'm not
sure what this is showing. I suppose you're talking about idmaps in
general.

# cat /tmp/file
hi there

<-- Hey, look, this kind of works!

# setpriv --reuid=1 ls /tmp
ls: cannot open directory '/tmp': Permission denied

<-- Gee, thanks, Linux!
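
The two outcomes above can be reproduced with a small model of id translation followed by the usual owner/group/other check. This is a minimal sketch in Python, not kernel code. The map layout (container id 0 is host uid 1000, container ids from 1 upward come from a subordinate range starting at 100000) and the directory owner are assumptions chosen to resemble a typical rootless setup. ACLs, capabilities, supplementary groups and LSMs are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    inside: int   # first id as seen inside the user namespace
    outside: int  # first host id it maps to
    count: int    # number of consecutive ids covered


def to_host(idmap, inside_id):
    """Translate a namespace id to a host id, or None if it is unmapped."""
    for e in idmap:
        if e.inside <= inside_id < e.inside + e.count:
            return e.outside + (inside_id - e.inside)
    return None


def may_list(host_uid, host_gid, owner, group, mode):
    """Simplified DAC check for listing a directory (needs r and x).

    Ignores ACLs, capabilities, supplementary groups and LSMs.
    """
    if host_uid == owner:
        bits = (mode >> 6) & 0o7
    elif host_gid == group:
        bits = (mode >> 3) & 0o7
    else:
        bits = mode & 0o7
    return (bits & 0o5) == 0o5


# Assumed rootless layout: container id 0 is the host user (1000 here),
# container ids 1..65536 come from a subordinate range starting at 100000.
IDMAP = [Extent(0, 1000, 1), Extent(1, 100000, 65536)]

# test_directory from the example: owned by the host user, mode 0700.
OWNER, GROUP, MODE = 1000, 1000, 0o700


def container_can_list(uid, gid):
    """Apply the same assumed map to uid and gid, then run the DAC check."""
    return may_list(to_host(IDMAP, uid), to_host(IDMAP, gid), OWNER, GROUP, MODE)


print(container_can_list(0, 0))  # container root translates to the directory owner
print(container_can_list(1, 0))  # uid 1 translates to a subordinate id, not the owner
```

In this model `setpriv --reuid=1` changes only the uid, so the caller keeps gid 0. That still lands in the group class of a 0700 directory, which grants nothing.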


Obviously this is a made up example.  But it's quite analogous to a
real example.  Suppose I want to make a directory that will contain
some MySQL data.  I don't want to share this directory with anyone
else, so I set its mode to 0700.  Then I want to fire up an
unprivileged MySQL container, so I build or download it, and then I
run it and bind my directory to /var/lib/mysql and I run it.  I don't
need to think about UIDs or anything because it's 2024 and containers
just work.  Okay, I need to setenforce 0 because I'm on Fedora and
SELinux makes absolutely no sense in a container world, but I can live
with that.

Except that it doesn't work!  Because unless I want to manually futz
with the idmaps to get mysql to have access to the directory inside
the container, only *root* gets to get in.  But I bet that even
futzing with the idmap doesn't work, because software like mysql often
expects that root *and* a user can access data.  And some software
even does privilege separation and uses more than one UID.

If the directory is 0700 and it's owned by, say, root:root on the host and
you want to share that with arbitrary container users then this isn't
something you can do today (ignoring group permissions and ACLs for the
sake of your argument) even on the host so that's not a limitation of
userns or idmapped mounts. That means many-to-one mappings of uids/gids.

So I want a way to give *an entire container* access to a directory.
Classic UNIX DAC is just *wrong* for this use case.  Maybe idmaps
could learn a way to squash multiple ids down to one.  Or maybe

Many-to-one idmappings are in principle possible, and I noted that idea
down as a possible extension at
https://github.com/uapi-group/kernel-features quite a while (2 years?) ago.

I haven't looked at the idmap implementation nearly enough to have any
opinion as to whether squashing UID is practical or whether there's

It's doable. The interesting bit to me was that if we want to allow
writes we'd need a way to determine what the uid/gid would be to write
down. Imho, that's not super difficult to solve though. The most obvious
one is that userspace can just determine it when creating the idmapped
mount.
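
To make the write-down question concrete, here is a purely illustrative Python model of a many-to-one mapping where userspace fixes the on-disk id when the mount is created. Nothing here is an existing kernel interface: `SquashedMount`, `write_down` and `create_file` are hypothetical names, and the thread does not settle what the real semantics would be.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SquashedMount:
    """Hypothetical many-to-one idmapped mount (not an existing kernel API)."""
    first: int       # first caller id covered by the mapping
    count: int       # number of consecutive caller ids covered
    write_down: int  # on-disk id, fixed by userspace at mount creation

    def covers(self, caller_id):
        return self.first <= caller_id < self.first + self.count

    def on_disk_id(self, caller_id):
        """Id used on disk for this caller, or None if the caller is unmapped."""
        return self.write_down if self.covers(caller_id) else None


def create_file(mount, caller_uid, name, store):
    """Record a new file's on-disk owner in `store`, a dict standing in for the filesystem."""
    owner = mount.on_disk_id(caller_uid)
    if owner is None:
        raise PermissionError(f"uid {caller_uid} is not mapped in this mount")
    store[name] = owner
    return owner


# Assumed values: the whole container range 0..65535 is squashed to on-disk uid 1000.
mount = SquashedMount(first=0, count=65536, write_down=1000)
files = {}
create_file(mount, 0, "ibdata1", files)    # written by container root
create_file(mount, 999, "mysql.sock", files)  # written by an unprivileged container uid
print(files)
```

Because the on-disk id is a property of the mount rather than of the caller, every file created through it ends up with the same owner, which is the behaviour the "entire container" use case asks for.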