Re: [PATCH 5/5] cgroup: introduce cgroup namespaces
From: Andy Lutomirski <hidden>
Date: 2014-07-18 16:51:57
Also in: cgroups, lkml
On Jul 17, 2014 1:56 PM, "Aditya Kali" [off-list ref] wrote:
> On Thu, Jul 17, 2014 at 12:57 PM, Andy Lutomirski [off-list ref] wrote:
>> On Thu, Jul 17, 2014 at 12:52 PM, Aditya Kali [off-list ref] wrote:
>>> Introduce the ability to create new cgroup namespace. The newly created cgroup namespace remembers the 'struct cgroup *root_cgrp' at the point of creation of the cgroup namespace. The task that creates the new cgroup namespace and all its future children will now be restricted only to the cgroup hierarchy under this root_cgrp. In the first version, setns() is not supported for cgroup namespaces.
>>>
>>> The main purpose of cgroup namespace is to virtualize the contents of /proc/self/cgroup file. Processes inside a cgroup namespace are only able to see paths relative to their namespace root. This allows container-tools (like libcontainer, lxc, lmctfy, etc.) to create completely virtualized containers without leaking system level cgroup hierarchy to the task.
>>
>> What happens if someone moves a task in a cgroup namespace outside of the namespace root cgroup?
>
> Attempt to move a task outside of cgroupns root will fail with EPERM. This is true irrespective of the privileges of the process attempting this. Once cgroupns is created, the task will be confined to the cgroup hierarchy under its cgroupns root until it dies.
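
To make the behavior described above concrete, here is a rough user-space model in Python. It is not the patch's kernel code. The function names (ns_relative_path, virtualize_line, check_move) and the example cgroup paths are made up for illustration, and raising an error for a path outside the namespace root is an assumption of this sketch, not something the patch description specifies. The only outside fact relied on is the /proc/self/cgroup line format, hierarchy-ID:controller-list:path.

```python
import errno
import posixpath


def ns_relative_path(cgroup_path, ns_root):
    """Return cgroup_path as seen from a namespace rooted at ns_root.

    Returns None when cgroup_path is not at or below ns_root.
    """
    path = posixpath.normpath(cgroup_path)
    root = posixpath.normpath(ns_root)
    if root == "/":
        return path
    if path == root:
        return "/"
    # Compare whole path components, so /batch/job10 is not under /batch/job1.
    if path.startswith(root + "/"):
        return path[len(root):]
    return None


def virtualize_line(line, ns_root):
    """Rewrite one 'hierarchy-ID:controller-list:path' line relative to ns_root."""
    hier_id, controllers, path = line.rstrip("\n").split(":", 2)
    rel = ns_relative_path(path, ns_root)
    if rel is None:
        # Assumption of this sketch: outside paths are simply rejected here.
        raise ValueError("path is outside the namespace root")
    return f"{hier_id}:{controllers}:{rel}"


def check_move(dest_cgroup, ns_root):
    """Model the move rule: 0 if dest is under ns_root, -EPERM otherwise.

    Privileges are deliberately not consulted, matching the statement that
    the restriction holds irrespective of the mover's privileges.
    """
    if ns_relative_path(dest_cgroup, ns_root) is None:
        return -errno.EPERM
    return 0
```

With a namespace rooted at the hypothetical /batch/job1, the line "4:memory:/batch/job1/worker" would be shown as "4:memory:/worker", and an attempted move to /batch/job2 would be refused with -EPERM.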
Can a task in a non-init userns create a cgroupns? If not, that's unusual. If so, is it problematic if they can prevent themselves from being moved?

I hate to say it, but it might be worth requiring explicit permission from the cgroup manager for this. For example, there could be a new cgroup attribute may_unshare, and any attempt to unshare the cgroup ns will fail with -EPERM unless the caller is in a may_unshare=1 cgroup. may_unshare in a parent cgroup would not give child cgroups the ability to unshare.

--Andy
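
As a toy illustration of the may_unshare idea, the check could be modeled like this in Python. Nothing here exists in the kernel: the Cgroup class and try_unshare_cgroupns are hypothetical names, and may_unshare itself is only the attribute proposed above. The point of the sketch is the non-inheritance rule: only the caller's own cgroup is consulted, never its ancestors.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Minimal stand-in for a cgroup; not the kernel's struct cgroup."""
    name: str
    parent: Optional["Cgroup"] = None
    may_unshare: bool = False  # hypothetical attribute from the proposal


def try_unshare_cgroupns(current: Cgroup) -> int:
    """Return 0 if the caller's cgroup permits unsharing, else -EPERM.

    Ancestors are intentionally ignored: may_unshare on a parent does
    not grant the ability to its children.
    """
    return 0 if current.may_unshare else -errno.EPERM
```

In this model, a cgroup manager that sets may_unshare on a parent still has to set it explicitly on each child that should be allowed to create its own cgroup namespace.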