Re: RFC(v2): Audit Kernel Container IDs

From: Richard Guy Briggs <hidden>
Date: 2017-12-11 15:10:57
Also in: cgroups, linux-api, linux-fsdevel, lkml

On 2017-12-09 11:20, Mickaël Salaün wrote:

On 12/10/2017 18:33, Casey Schaufler wrote:

quoted

On 10/12/2017 7:14 AM, Richard Guy Briggs wrote:

quoted

Containers are a userspace concept.  The kernel knows nothing of them.

The Linux audit system needs a way to be able to track the container
provenance of events and actions.  Audit needs the kernel's help to do
this.

Since the concept of a container is entirely a userspace concept, a
registration from the userspace container orchestration system initiates
this.  This will define a point in time and a set of resources
associated with a particular container with an audit container ID.

The registration is a pseudo filesystem (proc, since PID tree already
exists) write of a u8[16] UUID representing the container ID to a file
representing a process that will become the first process in a new
container.  This write might place restrictions on mount namespaces
required to define a container, or at least careful checking of
namespaces in the kernel to verify permissions of the orchestrator so it
can't change its own container ID.  A bind mount of nsfs may be
necessary in the container orchestrator's mntNS.
Note: Use a 128-bit scalar rather than a string to make compares faster
and simpler.
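
As a rough illustration of the registration step described above, the sketch below converts a textual UUID into its 16 raw bytes and writes them to a per-process proc file. The file name `containerid`, the `proc_root` parameter and all function names are assumptions made for illustration only; the proposal does not name the file. The comparison helper shows the point of the note: with a 128-bit scalar, equality is a single integer compare.

```python
import uuid


def container_id_bytes(text: str) -> bytes:
    """Convert a textual UUID into its 16 raw bytes (the u8[16] form)."""
    return uuid.UUID(text).bytes


def same_container(a: bytes, b: bytes) -> bool:
    """Compare two container IDs as 128-bit scalars, not as strings."""
    return int.from_bytes(a, "big") == int.from_bytes(b, "big")


def register(pid: int, cid: bytes, proc_root: str = "/proc",
             filename: str = "containerid") -> None:
    """Write the container ID to the (hypothetical) per-process proc file."""
    if len(cid) != 16:
        raise ValueError("container ID must be exactly 16 bytes")
    # Assumed path layout: <proc_root>/<pid>/<filename>
    with open(f"{proc_root}/{pid}/{filename}", "wb") as f:
        f.write(cid)
```

Two spellings of the same UUID (with or without hyphens, upper or lower case) reduce to the same 16 bytes, so they compare equal as scalars even though they would differ as strings.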

Require a new CAP_CONTAINER_ADMIN to be able to carry out the
registration.

Hang on. If containers are a user space concept, how can
you want CAP_CONTAINER_ANYTHING? If there's no such thing as
a container, how can you be asking for a capability to manage
them?

quoted

At that time, record the target container's user-supplied
container identifier along with the process ID of the target container's
first process (which may become the target container's "init" process),
referenced from the initial PID namespace, and all namespace IDs (in the
form of an nsfs device number and inode number tuple) in a new auxiliary
record AUDIT_CONTAINER with a qualifying op=$action field.
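
For reference, the (device number, inode number) tuples mentioned above can already be read from userspace by calling stat on the links under /proc/<pid>/ns. The sketch below is only an illustration of where those values come from; the function name and the `proc_root` parameter are assumptions, and it says nothing about how the kernel would format the AUDIT_CONTAINER record.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Return {namespace name: (device number, inode number)} for a process."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids  # no procfs here, or no permission to look
    for name in names:
        try:
            # stat follows the link, so this is the nsfs device and inode
            st = os.stat(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries that cannot be inspected
        ids[name] = (st.st_dev, st.st_ino)
    return ids
```

Calling `namespace_ids()` with no arguments inspects the calling process; passing a PID inspects another process, subject to the usual permission checks on /proc.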

Here is an idea to avoid privilege problems or the need for a new
capability: make it automatic. What makes a container a container seems
to be the use of at least one namespace. What about automatically creating
and assigning an ID to a process when it enters a namespace different from
one of its parent process's? This delegates the (permission)
responsibility to the use of namespaces (e.g. the /proc/sys/user/max_*
limits).

A container doesn't imply a namespace and vice versa.

One interesting side effect of this approach would be the ability to
identify which processes are in the same set of namespaces, even if they
were not spawned from the container but entered it after its creation
(i.e. using setns), by creating container IDs as a (deterministic)
checksum of the /proc/self/ns/* IDs.

This would be really helpful, but it isn't the case.
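
To make the checksum suggestion above concrete, here is a minimal sketch of deriving a 128-bit ID from a set of namespace (device, inode) pairs. The choice of SHA-256 truncated to 16 bytes, the input layout and the function name are all assumptions for illustration; the thread does not specify a hash or an encoding.

```python
import hashlib
import uuid


def checksum_id(ns_ids: dict) -> uuid.UUID:
    """Derive a deterministic 128-bit ID from {name: (dev, inode)} pairs."""
    digest = hashlib.sha256()
    # Sort by namespace name so the result does not depend on input order.
    for name in sorted(ns_ids):
        dev, ino = ns_ids[name]
        digest.update(f"{name}:{dev}:{ino}\n".encode())
    # Truncate to 16 bytes to match the u8[16] container ID size.
    return uuid.UUID(bytes=digest.digest()[:16])
```

Two processes that share every namespace would get the same ID under this scheme, and a process that differs in any one namespace would get a different one. That is exactly the property that breaks down when container membership does not map one-to-one onto a namespace set.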

Since the concern is to identify a container, I think the ability to
audit the switch from one container ID to another is enough. I don't
think we need nested IDs.

Since container namespace membership is arbitrary between container
orchestrators, this needs a registration process and a way for the
container orchestrator to know the ID.


I completely agree with Casey here.

As a side note, you may want to take a look at the Linux-VServer's XID.

Regards,
 Mickaël

- RGB

--
Richard Guy Briggs [off-list ref]
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
