Re: [PATCH bpf-next v8 00/11] Landlock LSM: Toward unprivileged sandboxing

From: Andy Lutomirski <luto@amacapital.net>
Date: 2018-02-27 04:37:21
Also in: linux-api, linux-security-module, netdev

On Tue, Feb 27, 2018 at 12:41 AM, Mickaël Salaün [off-list ref] wrote:

Hi,

This eighth series is a major revamp of the Landlock design compared to
the previous series [1]. This enables more flexibility and granularity
of access control with file paths. It is now possible to enforce an
access control according to a file hierarchy. Landlock uses the concept
of inode and path to identify such hierarchy. In a way, it brings tools
to program what is a file hierarchy.

There are now three types of Landlock hooks: FS_WALK, FS_PICK and FS_GET.
Each of them accepts a dedicated eBPF program, called a Landlock
program.  They can be chained to enforce a full access control according
to a list of directories or files. The set of actions on a file is well
defined (e.g. read, write, ioctl, append, lock, mount...) taking
inspiration from the major Linux LSMs and some other access-controls
like Capsicum.  These program types are designed to be cache-friendly,
which gives room for optimizations in the future.

The documentation patch contains some kernel documentation and
explanations on how to use Landlock.  The compiled documentation and
a talk I gave at FOSDEM can be found here: https://landlock.io
This patch series can be found in the branch landlock-v8 in this repo:
https://github.com/landlock-lsm/linux

There are still some minor issues with this patch series but it should
demonstrate how powerful this design may be. One of these issues is that
it is not a stackable LSM anymore, but the infrastructure management of
security blobs should allow stacking it with other LSMs [4].

This is the first step of the roadmap discussed at LPC [2].  While the
intended final goal is to allow unprivileged users to use Landlock, this
series allows only a process with global CAP_SYS_ADMIN to load and
enforce a rule.  This may help to get feedback and avoid unexpected
behaviors.

This series can be applied on top of bpf-next, commit 7d72637eb39f
("Merge branch 'x86-jit'").  This can be tested with
CONFIG_SECCOMP_FILTER and CONFIG_SECURITY_LANDLOCK.  I would really
appreciate constructive comments on the design and the code.


# Landlock LSM

The goal of this new Linux Security Module (LSM) called Landlock is to
allow any process, including unprivileged ones, to create powerful
security sandboxes comparable to XNU Sandbox or OpenBSD Pledge. This
kind of sandbox is expected to help mitigate the security impact of bugs
or unexpected/malicious behaviors in user-space applications.

The approach taken is to add the minimum amount of code while still
allowing the user-space application to create quite complex access
rules.  A dedicated security policy language such as the one used by
SELinux, AppArmor and other major LSMs involves a lot of code and is
usually permitted to only a trusted user (i.e. root).  On the contrary,
eBPF programs already exist and are designed to be safely loaded by
unprivileged user-space.

This design does not seem too intrusive but is flexible enough to allow
a powerful sandbox mechanism accessible by any process on Linux. Both
seccomp and Landlock are easier to use through a user-space library
(e.g. libseccomp) that could provide a high-level language to express a
security policy instead of raw eBPF programs.
Moreover, thanks to the LLVM front-end, it is quite easy to write an
eBPF program with a subset of the C language.


# Frequently asked questions

## Why is seccomp-bpf not enough?

A seccomp filter can access only raw syscall arguments (i.e. the
register values) which means that it is not possible to filter according
to the value pointed to by an argument, such as a file pathname. As an
embryonic Landlock version demonstrated, filtering at the syscall level
is complicated (e.g. need to take care of race conditions). This is
mainly because the access control checkpoints of the kernel are not at
this high level but lower down, at the LSM-hook level. The LSM
hooks are designed to handle this kind of check.  Landlock abstracts
this approach to leverage the ability of unprivileged users to limit
themselves.

Cf. section "What it isn't?" in Documentation/prctl/seccomp_filter.txt
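
To make the "raw syscall arguments" point concrete, here is a minimal
user-space sketch, not kernel or Landlock code. It mirrors the layout of
struct seccomp_data from <linux/seccomp.h> with ctypes and shows that a
pathname argument reaches a filter only as an address. The helper name
snapshot_open_call and the sample paths are hypothetical, and the syscall
number 2 is the x86-64 value of open(2), used purely as an example.

```python
import ctypes


class SeccompData(ctypes.Structure):
    """User-space mirror of struct seccomp_data (<linux/seccomp.h>)."""
    _fields_ = [
        ("nr", ctypes.c_int),                      # syscall number
        ("arch", ctypes.c_uint32),                 # AUDIT_ARCH_* value
        ("instruction_pointer", ctypes.c_uint64),  # IP at syscall entry
        ("args", ctypes.c_uint64 * 6),             # raw register values
    ]


def snapshot_open_call(path_buffer):
    """Build what a filter would be handed for a hypothetical open(path)."""
    data = SeccompData()
    data.nr = 2  # assumption: __NR_open on x86-64, for illustration only
    data.args[0] = ctypes.addressof(path_buffer)  # a pointer, not the string
    return data


# The caller's pathname lives in its own memory, outside seccomp_data.
path = ctypes.create_string_buffer(b"/tmp/allowed", 64)
seen = snapshot_open_call(path)
before = bytes(seen)

# Another thread could rewrite the buffer between the check and the use.
path.value = b"/etc/shadow"
after = bytes(seen)

# The data visible to the filter is byte-for-byte identical in both cases.
print(before == after)
```

The filter input does not change when the pathname does, which is both why
a seccomp filter cannot decide on the path and why checking it at syscall
entry would be racy.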


## Why use the seccomp(2) syscall?

Landlock uses the same semantics as seccomp to apply access rule
restrictions. It adds a new layer of security for the current process
which is inherited by its children. It makes sense to use a unique
access-restricting syscall (that should be allowed by seccomp filters)
which can only drop privileges. Moreover, a Landlock rule could come
from outside a process (e.g.  passed through a UNIX socket). It is then
useful to differentiate the creation/load of Landlock eBPF programs via
bpf(2), from rule enforcement via seccomp(2).

This seems like a weak argument to me.  Sure, this is a bit different
from seccomp(), and maybe shoving it into the seccomp() multiplexer is
awkward, but surely the bpf() multiplexer is even less applicable.

But I think that you have more in common with seccomp() than you're
giving it credit for.  With seccomp, you need to either prevent
ptrace() of any more-privileged task or you need to filter to make
sure you can't trace a more privileged program.  With landlock, you
need exactly the same thing.  You have basically the same no_new_privs
considerations, etc.

Also, looking forward, I think you're going to want a bunch of the
stuff that's under consideration as new seccomp features.  Tycho is
working on a "user notifier" feature for seccomp where, in addition to
accepting, rejecting, or kicking to ptrace, you can send a message to
the creator of the filter and wait for a reply.  I think that Landlock
will want exactly the same feature.

In other words, it really seems to me that you should extend seccomp()
with the ability to attach filters to things that aren't syscall
entry, e.g. file open.

I would also seriously consider doing a scaled-back Landlock variant
first, with the intent of getting the main mechanism into the kernel.
In particular, there are two big sources of complexity in Landlock.
You need to deal with the API for managing bpf programs that filter
various actions beyond just syscall entry, and you need to deal with
giving those filters a way to deal with inodes, paths, etc.  But you
can do the former without the latter.  For example, you could start
with some Landlock-style filters on things that have nothing to do
with files.  For example, you could allow a filter for connecting to
an abstract-namespace unix socket.  Or you could have a hook for
file_receive.  (You couldn't meaningfully filter based on the *path*
of the fd being received without adding all the path infrastructure,
but you could filter on the *type* of the fd being received.)  Both of
these add new sandboxing abilities that don't currently exist.  In
particular, you can't write a seccomp rule that prevents receiving an
fd using recvmsg() right now unless you block cmsg entirely.  And you
can't write a filter that allows connecting to unix sockets by path
without allowing abstract namespace sockets either.
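
As a rough user-space illustration of the two examples above (Linux only,
and not a proposed kernel interface), the sketch below passes descriptors
with SCM_RIGHTS and classifies them by type after receipt, then connects
to both an abstract-namespace and a pathname unix socket through the same
connect() call. The helper names (send_fds, recv_fds, fd_kind,
received_kinds, bound_name) and the socket names are hypothetical. A
file_receive-style hook would make the type decision inside the kernel
before delivery; here it can only be observed afterwards.

```python
import array
import os
import socket
import stat
import tempfile


def send_fds(sock, payload, fds):
    """Send descriptors as SCM_RIGHTS ancillary data next to a payload."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([payload], ancillary)


def recv_fds(sock, payload_len, max_fds):
    """Receive a payload and return any descriptors that came with it."""
    fds = array.array("i")
    space = socket.CMSG_LEN(max_fds * fds.itemsize)
    _, ancdata, _, _ = sock.recvmsg(payload_len, space)
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return list(fds)


def fd_kind(fd):
    """Classify a descriptor by file type, ignoring any path it may have."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISREG(mode):
        return "regular"
    return "other"


def received_kinds():
    """Pass a pipe end and a socket over a socketpair; report their types."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pipe_r, pipe_w = os.pipe()
    extra = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    received = []
    try:
        send_fds(left, b"x", [pipe_r, extra.fileno()])
        received = recv_fds(right, 1, 4)
        return [fd_kind(fd) for fd in received]
    finally:
        for fd in received + [pipe_r, pipe_w]:
            os.close(fd)
        for sock in (left, right, extra):
            sock.close()


def bound_name(address):
    """Bind, listen and connect on a unix address; return the bound name."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(address)
        server.listen(1)
        client.connect(address)  # same call for abstract and pathname
        return server.getsockname()
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(received_kinds())
    # A leading NUL byte selects the abstract namespace (no filesystem entry).
    print(bound_name(b"\0landlock-demo-" + str(os.getpid()).encode()))
    with tempfile.TemporaryDirectory() as tmp:
        print(bound_name(os.path.join(tmp, "sock")))
```

In both socket cases the only difference is the content of the address
buffer, which a seccomp filter sees as an opaque pointer.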

If you split up Landlock like this then, once you got all the
installation and management of filters down, you could submit patches
to add all the path stuff and deal with that review separately.

What do you all think?
