Thread: 25 messages, 7 authors, 2017-11-10

Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces

From: Serge E. Hallyn <hidden>
Date: 2017-11-07 03:23:14
Also in: lkml, netdev

On Mon, Nov 06, 2017 at 09:16:03PM -0500, Daniel Micay wrote:
On Mon, 2017-11-06 at 16:14 -0600, Serge E. Hallyn wrote:
quoted
Quoting Daniel Micay (danielmicay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
quoted
Substantial added attack surface will never go away as a problem. There
aren't a finite number of vulnerabilities to be found.

There's varying levels of usefulness and quality.  There is code which I
want to be able to use in a container, and code which I can't ever see a
reason for using there.  The latter, especially if it's also in a
staging driver, would be nice to have a toggle to disable.

You're not advocating dropping the added attack surface, only adding a
way of dealing with an 0day after the fact.  Privilege raising 0days can
exist anywhere, not just in code which only root in a user namespace can
exercise.  So from that point of view, ksplice seems a more complete
solution.  Why not just actually fix the bad code block when we know
about it?

That's not what I'm advocating. I only care about it for proactive
attack surface reduction downstream. I have no interest in using it to
block access to known vulnerabilities.
quoted
Finally, it has been well argued that you can gain many new caps from
having only a few others.  Given that, how could you ever be sure that,
if an 0day is found which allows root in a user ns to abuse
CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
would suffice?

I didn't suggest using it that way...
quoted
It seems to me that the existing control in
/proc/sys/kernel/unprivileged_userns_clone might be the better duct tape
in that case.

There's no such thing as unprivileged_userns_clone in mainline.

Hm.  I was sure Kees had gotten that in...  I guess I was wrong.

The advantage of this over unprivileged_userns_clone in Debian and maybe
some other distributions is not giving up unprivileged app containers /
sandboxes implemented via user namespaces.  For example, Chromium's user
namespace sandbox likely only needs to have CAP_SYS_CHROOT. Chromium
will be dropping their setuid sandbox, forcing usage of user namespaces
to avoid losing the sandbox which will greatly increase local kernel
attack surface on the host by exposing netfilter management, etc. to
unprivileged users.

The proposed approach isn't necessarily the best way to implement this
kind of mitigation but I think it's filling a real need.

I think I definitely prefer what I mentioned in the email to Boris.
Basically a "permanent capability bounding set".  The normal bounding
set gets reset to a full set on every new user_ns creation.  In this
proposal, it would instead be set to the calling task's permanent
capability set, which starts (at boot) full, and which privileged
tasks can pull capabilities out of.

-serge
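
The "permanent capability bounding set" idea above can be sketched as a
toy model. This is not kernel code and not an API from the patch: the
Task class, its field names, and both new_userns_* helpers are
hypothetical, and "privileged" is assumed to mean privileged in the
initial namespace. Only the capability names are real. The sketch just
contrasts resetting the bounding set to full on user_ns creation with
seeding it from the caller's permanent set.

```python
from dataclasses import dataclass

# Real capability names, but only a small illustrative subset.
ALL_CAPS = frozenset(
    {"CAP_SYS_CHROOT", "CAP_NET_ADMIN", "CAP_SYS_ADMIN", "CAP_SETUID"}
)


@dataclass
class Task:
    # Hypothetical "permanent capability bounding set": full at boot.
    permanent: frozenset = ALL_CAPS
    # Normal bounding set of the task.
    bounding: frozenset = ALL_CAPS
    # Assumption: "privileged" means privileged in the initial namespace.
    privileged: bool = False

    def drop_permanent(self, cap: str) -> None:
        """Pull one capability out of the permanent set (privileged only)."""
        if not self.privileged:
            raise PermissionError("only privileged tasks may shrink the set")
        self.permanent = self.permanent - {cap}


def new_userns_current(parent: Task) -> Task:
    """Behavior described in the thread: bounding set reset to full."""
    return Task(permanent=parent.permanent, bounding=ALL_CAPS)


def new_userns_proposed(parent: Task) -> Task:
    """Proposal: bounding set starts from the caller's permanent set."""
    return Task(permanent=parent.permanent, bounding=parent.permanent)


if __name__ == "__main__":
    admin = Task(privileged=True)
    for cap in ("CAP_NET_ADMIN", "CAP_SYS_ADMIN", "CAP_SETUID"):
        admin.drop_permanent(cap)
    print(sorted(new_userns_current(admin).bounding))
    print(sorted(new_userns_proposed(admin).bounding))
```

In this model, once a privileged task has pulled everything except
CAP_SYS_CHROOT out of the permanent set, a namespace created the
proposed way is bounded to CAP_SYS_CHROOT alone, which is roughly the
Chromium sandbox case mentioned earlier, while the current behavior
still hands back the full set.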