Thread (76 messages) 76 messages, 9 authors, 2016-10-19

Re: [RFC v3 18/22] cgroup,landlock: Add CGRP_NO_NEW_PRIVS to handle unprivileged hooks

From: Sargun Dhillon <hidden>
Date: 2016-09-20 04:37:53
Also in: cgroups, linux-api, lkml

On Thu, Sep 15, 2016 at 09:41:33PM +0200, Mickaël Salaün wrote:
On 15/09/2016 06:48, Alexei Starovoitov wrote:
quoted
On Wed, Sep 14, 2016 at 09:38:16PM -0700, Andy Lutomirski wrote:
quoted
On Wed, Sep 14, 2016 at 9:31 PM, Alexei Starovoitov
[off-list ref] wrote:
quoted
On Wed, Sep 14, 2016 at 09:08:57PM -0700, Andy Lutomirski wrote:
quoted
On Wed, Sep 14, 2016 at 9:00 PM, Alexei Starovoitov
[off-list ref] wrote:
quoted
On Wed, Sep 14, 2016 at 07:27:08PM -0700, Andy Lutomirski wrote:
quoted
quoted
quoted
quoted
This RFC handle both cgroup and seccomp approaches in a similar way. I
don't see why building on top of cgroup v2 is a problem. Is there
security issues with delegation?
What I mean is: cgroup v2 delegation has a functionality problem.
Tejun says [1]:

We haven't had to face this decision because cgroup has never properly
supported delegating to applications and the in-use setups where this
happens are custom configurations where there is no boundary between
system and applications and adhoc trial-and-error is good enough a way
to find a working solution.  That wiggle room goes away once we
officially open this up to individual applications.

Unless and until that changes, I think that landlock should stay away
from cgroups.  Others could reasonably disagree with me.
Ours and Sargun's use cases for cgroup+lsm+bpf is not for security
and not for sandboxing. So the above doesn't matter in such contexts.
lsm hooks + cgroups provide convenient scope and existing entry points.
Please see checmate examples how it's used.
To be clear: I'm not arguing at all that there shouldn't be
bpf+lsm+cgroup integration.  I'm arguing that the unprivileged
landlock interface shouldn't expose any cgroup integration, at least
until the cgroup situation settles down a lot.
ahh. yes. we're perfectly in agreement here.
I'm suggesting that the next RFC shouldn't include unpriv
and seccomp at all. Once bpf+lsm+cgroup is merged, we can
argue about unpriv with cgroups and even unpriv as a whole,
since it's not a given. Seccomp integration is also questionable.
I'd rather not have seccomp as a gate keeper for this lsm.
lsm and seccomp are orthogonal hook points. Syscalls and lsm hooks
don't have one to one relationship, so mixing them up is only
asking for trouble further down the road.
If we really need to carry some information from seccomp to lsm+bpf,
it's easier to add eBPF support to seccomp and let bpf side deal
with passing whatever information.
As an argument for keeping seccomp (or an extended seccomp) as the
interface for an unprivileged bpf+lsm: seccomp already checks off most
of the boxes for safely letting unprivileged programs sandbox
themselves.
you mean the attach part of seccomp syscall that deals with no_new_priv?
sure, that's reusable.
quoted
Furthermore, to the extent that there are use cases for
unprivileged bpf+lsm that *aren't* expressible within the seccomp
hierarchy, I suspect that syscall filters have exactly the same
problem and that we should fix seccomp to cover it.
not sure what you mean by 'seccomp hierarchy'. The normal process
hierarchy ?
Kind of.  I mean the filter layers that are inherited across fork(),
the TSYNC mechanism, etc.
quoted
imo the main deficiency of secccomp is inability to look into arguments.
One can argue that it's a blessing, since composite args
are not yet copied into the kernel memory.
But in a lot of cases the seccomp arguments are FDs pointing
to kernel objects and if programs could examine those objects
the sandboxing scope would be more precise.
lsm+bpf solves that part and I'd still argue that it's
orthogonal to seccomp's pass/reject flow.
I mean if seccomp says 'ok' the syscall should continue executing
as normal and whatever LSM hooks were triggered by it may have
their own lsm+bpf verdicts.
I agree with all of this...
quoted
Furthermore in the process hierarchy different children
should be able to set their own lsm+bpf filters that are not
related to parallel seccomp+bpf hierarchy of programs.
seccomp syscall can be an interface to attach programs
to lsm hooks, but nothing more than that.
I'm not sure what you mean.  I mean that, logically, I think we should
be able to do:

seccomp(attach a syscall filter);
fork();
child does seccomp(attach some lsm filters);

I think that they *should* be related to the seccomp+bpf hierarchy of
programs in that they are entries in the same logical list of filter
layers installed.  Some of those layers can be syscall filters and
some of the layers can be lsm filters.  If we subsequently add a way
to attach a removable seccomp filter or a way to attach a seccomp
filter that logs failures to some fd watched by an outside monitor, I
think that should work for lsm, too, with more or less the same
interface.

If we need a way for a sandbox manager to opt different children into
different subsets of fancy filters, then I think that syscall filters
and lsm filters should use the same mechanism.

I think we might be on the same page here and just saying it different ways.
Sounds like it :)
All of the above makes sense to me.
The 'orthogonal' part is that the user should be able to use
this seccomp-managed hierarchy without actually enabling
TIF_SECCOMP for the task and syscalls should still go through
fast path and all the way till lsm hooks as normal.
I don't want to pay _any_ performance penalty for this feature
for lsm hooks (and all syscalls) that don't have bpf programs attached.
Yes, it seems that we are all on the same page here, and that match this
RFC implementation. So, using the seccomp(2) *interface* to attach
Landlock programs to a process hierarchy is still on track. :)
So, I'm catching up on this after a little while away. I really like the 
simplicity of the approach Daniel took with his patches. I began to have 
difficulty reading your patchset once you got into using seccomp + unprivileged 
mode. I would love to see a separate patchset that only have the verifier, and
lsm hook changes. Do you think you could decompose your patchset into an MVP?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help