Thread (51 messages) 51 messages, 11 authors, 2018-10-08

Re: [PATCH 0/3] namei: implement various scoping AT_* flags

From: Christian Brauner <christian@brauner.io>
Date: 2018-09-30 20:35:29
Also in: linux-fsdevel, linux-kselftest, lkml

On September 30, 2018 3:54:31 PM GMT+02:00, Alban Crequy [off-list ref] wrote:
On Sat, Sep 29, 2018 at 12:35 PM Aleksa Sarai [off-list ref]
wrote:
quoted
The need for some sort of control over VFS's path resolution (to
avoid
quoted
malicious paths resulting in inadvertent breakouts) has been a very
long-standing desire of many userspace applications. This patchset is
a
quoted
revival of Al Viro's old AT_NO_JUMPS[1] patchset with a few
additions.
quoted
The most obvious change is that AT_NO_JUMPS has been split as
dicussed
quoted
in the original thread, along with a further split of AT_NO_PROCLINKS
which means that each individual property of AT_NO_JUMPS is now a
separate flag:

  * Path-based escapes from the starting-point using "/" or ".." are
    blocked by AT_BENEATH.
  * Mountpoint crossings are blocked by AT_XDEV.
  * /proc/$pid/fd/$fd resolution is blocked by AT_NO_PROCLINKS (more
        correctly it actually blocks any user of nd_jump_link()
because it
quoted
        allows out-of-VFS path resolution manipulation).

AT_NO_JUMPS is now effectively (AT_BENEATH|AT_XDEV|AT_NO_PROCLINKS).
At
quoted
Linus' suggestion in the original thread, I've also implemented
AT_NO_SYMLINKS which just denies _all_ symlink resolution (including
"proclink" resolution).
It seems quite useful to me.
quoted
An additional improvement was made to AT_XDEV. The original
AT_NO_JUMPS
quoted
path didn't consider "/tmp/.." as a mountpoint crossing -- this patch
blocks this as well (feel free to ask me to remove it if you feel
this
quoted
is not sane).

Currently I've only enabled these for openat(2) and the stat(2)
family.
quoted
I would hope we could enable it for basically every *at(2) syscall --
but many of them appear to not have a @flags argument and thus we'll
need to add several new syscalls to do this. I'm more than happy to
send
quoted
those patches, but I'd prefer to know that this preliminary work is
acceptable before doing a bunch of copy-paste to add new sets of
*at(2)
quoted
syscalls.
What do you think of an equivalent feature AT_NO_SYMLINKS flag for
mount()?
That's something we discussed  but that would need to be part of the new mount API work by David. The current mount API doesn't take AT_* flags since it doesn't operate on fds and we're (sort of) out of mount flags.
I guess that would have made the fix for CVE-2017-1002101 in
Kubernetes easier to write:
https://kubernetes.io/blog/2018/04/04/fixing-subpath-volume-vulnerability/
quoted
One additional feature I've implemented is AT_THIS_ROOT (I imagine
this
quoted
is probably going to be more contentious than the refresh of
AT_NO_JUMPS, so I've included it in a separate patch). The patch
itself
quoted
describes my reasoning, but the shortened version of the premise is
that
quoted
continer runtimes need to have a way to resolve paths within a
potentially malicious rootfs. Container runtimes currently do this in
userspace[2] which has implicit race conditions that are not
resolvable
quoted
in userspace (or use fork+exec+chroot and SCM_RIGHTS passing which is
inefficient). AT_THIS_ROOT allows for per-call chroot-like semantics
for
quoted
path resolution, which would be invaluable for us -- and the
implementation is basically identical to AT_BENEATH (except that we
don't return errors when someone actually hits the root).

I've added some selftests for this, but it's not clear to me whether
they should live here or in xfstests (as far as I can tell there are
no
quoted
other VFS tests in selftests, while there are some tests that look
like
quoted
generic VFS tests in xfstests). If you'd prefer them to be included
in
quoted
xfstests, let me know.

[1]: https://lore.kernel.org/patchwork/patch/784221/
[2]: https://github.com/cyphar/filepath-securejoin

Aleksa Sarai (3):
  namei: implement O_BENEATH-style AT_* flags
  namei: implement AT_THIS_ROOT chroot-like path resolution
  selftests: vfs: add AT_* path resolution tests

 fs/fcntl.c                                    |   2 +-
 fs/namei.c                                    | 158
++++++++++++------
quoted
 fs/open.c                                     |  10 ++
 fs/stat.c                                     |  15 +-
 include/linux/fcntl.h                         |   3 +-
 include/linux/namei.h                         |   8 +
 include/uapi/asm-generic/fcntl.h              |  20 +++
 include/uapi/linux/fcntl.h                    |  10 ++
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/vfs/.gitignore        |   1 +
 tools/testing/selftests/vfs/Makefile          |  13 ++
 tools/testing/selftests/vfs/at_flags.h        |  40 +++++
 tools/testing/selftests/vfs/common.sh         |  37 ++++
 .../selftests/vfs/tests/0001_at_beneath.sh    |  72 ++++++++
 .../selftests/vfs/tests/0002_at_xdev.sh       |  54 ++++++
 .../vfs/tests/0003_at_no_proclinks.sh         |  50 ++++++
 .../vfs/tests/0004_at_no_symlinks.sh          |  49 ++++++
 .../selftests/vfs/tests/0005_at_this_root.sh  |  66 ++++++++
 tools/testing/selftests/vfs/vfs_helper.c      | 154
+++++++++++++++++
quoted
 19 files changed, 707 insertions(+), 56 deletions(-)
 create mode 100644 tools/testing/selftests/vfs/.gitignore
 create mode 100644 tools/testing/selftests/vfs/Makefile
 create mode 100644 tools/testing/selftests/vfs/at_flags.h
 create mode 100644 tools/testing/selftests/vfs/common.sh
 create mode 100755
tools/testing/selftests/vfs/tests/0001_at_beneath.sh
quoted
 create mode 100755 tools/testing/selftests/vfs/tests/0002_at_xdev.sh
 create mode 100755
tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
quoted
 create mode 100755
tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
quoted
 create mode 100755
tools/testing/selftests/vfs/tests/0005_at_this_root.sh
quoted
 create mode 100644 tools/testing/selftests/vfs/vfs_helper.c

--
2.19.0
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help