Re: [PATCH v5 0/3] implement OA2_CRED_INHERIT flag for openat2()

From: Andy Lutomirski <luto@amacapital.net>
Date: 2024-04-28 20:19:20
Also in: linux-fsdevel, lkml

On Apr 28, 2024, at 10:39 AM, stsp [off-list ref] wrote:

28.04.2024 19:41, Andy Lutomirski пишет:

quoted

On Apr 26, 2024, at 6:39 AM, Stas Sergeev [off-list ref] wrote:

This patch-set implements the OA2_CRED_INHERIT flag for openat2() syscall.
It is needed to perform an open operation with the creds that were in
effect when the dir_fd was opened, if the dir was opened with O_CRED_ALLOW
flag. This allows the process to pre-open some dirs and switch eUID
(and other UIDs/GIDs) to the less-privileged user, while still retaining
the possibility to open/create files within the pre-opened directory set.

Then two different things could be done:

1. The subtree could be used unmounted or via /proc magic links. This
would be for programs that are aware of this interface.

2. The subtree could be mounted, and accessed through the mount would
use the captured creds.

Doesn't this have the same problem
that was pointed to me? Namely (explaining
my impl first), that if someone puts the cred
fd to an unaware process's fd table, such
process can't fully drop its privs. He may not
want to access these dirs, but once its hacked,
the hacker will access these dirs with the
creds came from an outside.

This is not a real problem. If I have a writable fd for /etc/shadow or
an fd for /dev/mem, etc, then I need close them to fully drop privs.

The problem is that, in current kernels, directory fds don’t allow
access using their f_cred, and changing that means that existing
software that does not intend to create a capability will start to do
so.

My solution was to close such fds on
exec and disallowing SCM_RIGHTS passage.

I don't see what problem this solves.

SCM_RIGHTS can be allowed in the future,
but the receiver will need to use some
new flag to indicate that he is willing to
get such an fd. Passage via exec() can
probably never be allowed however.

If I understand your model correctly, you
put a magic sub-tree to the fs scope of some
unaware process.

Only if I want that process to have access!

He may not want to access
it, but once hacked, the hacker will access
it with the creds from an outside.
And, unlike in my impl, in yours there is
probably no way to prevent that?

This is fundamental to the whole model. If I stick a FAT formatted USB
drive in the system and mount it, then any process that can find its
way to the mountpoint can write to it.  And if I open a dirfd, any
process with that dirfd can write it.  This is old news and isn't a
problem.

In short: my impl confines the hassle within
the single process. It can be extended, and
then the receiver will need to explicitly allow
adding such fds to his fd table.
But your idea seems to inherently require
2 processes, and there is probably no way
for the second process to say "ok, I allow
such sub-tree in my fs scope".

A process could use my proposal to open_tree directory, make a whole
new mountns that is otherwise empty, switch to the mountns, mount the
directory, then change its UID and drop all privs, and still have
access to that one directory.

But even right now, a process could unshare userns and mountns, map
its uid as some nonzero number, rearrange its mountns so it only has
access to that one directory, drop all privs, and seccomp itself, and
only have access to that directory, as it still has the same UID.
Take your pick.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help