Thread: 38 messages, 8 authors, 2016-10-05

Re: [Documentation] State of CPU controller in cgroup v2

From: James Bottomley <hidden>
Date: 2016-08-21 05:34:21
Also in: cgroups, lkml

On Wed, 2016-08-17 at 13:18 -0700, Andy Lutomirski wrote:
On Aug 5, 2016 7:07 PM, "Tejun Heo" [off-list ref] wrote:
[...]
2. Disagreements and Arguments

There have been several lengthy discussion threads [3][4] on LKML
around the structural constraints of cgroup v2.  The two that 
affect the CPU controller are process granularity and no internal 
process constraint.  Both arise primarily from the need for common 
resource domain definition across different resources.

The common resource domain is a powerful concept in cgroup v2 that
allows controllers to make basic assumptions about the structural
organization of processes and controllers inside the cgroup 
hierarchy, and thus solve problems spanning multiple types of 
resources.  The prime example for this is page cache writeback: 
dirty page cache is regulated through throttling buffered writers 
based on memory availability, and initiating batched write outs to 
the disk based on IO capacity.  Tracking and controlling writeback 
inside a cgroup thus requires the direct cooperation of the memory 
and the IO controller.

This easily extends to other areas, such as CPU cycles consumed 
while performing memory reclaim or IO encryption.


2-1. Contentious Restrictions

For controllers of different resources to work together, they must
agree on a common organization.  This uniform model across 
controllers imposes two contentious restrictions on the CPU 
controller: process granularity and the no-internal-process
constraint.


  2-1-1. Process Granularity

  For memory, because an address space is shared between all threads
  of a process, the terminal consumer is a process, not a thread.
  Separating the threads of a single process into different memory
  control domains doesn't make semantical sense.  cgroup v2 ensures
  that all controllers can agree on the same organization by requiring
  that threads of the same process belong to the same cgroup.

I haven't followed all of the history here, but it seems to me that
this argument is less accurate than it appears.  Linux, for better or
for worse, has somewhat orthogonal concepts of thread groups
(processes), mms, and file tables.  An mm has VMAs in it, and VMAs 
can reference things (files, etc) that hold resources.  (Two mms can
share resources by mapping the same thing or using fork().)  File 
tables hold files, and files can use resources.  Both of these are, 
at best, moderately good approximations of what actually holds 
resources. Meanwhile, threads (tasks) do syscalls, take page faults, 
*allocate* resources, etc.

So I think it's not really true to say that the "terminal consumer" 
of anything is a process, not a thread.

While it's certainly easier to think about assigning processes to
cgroups, and I certainly agree that, in the common case, it's the
right thing to do, I don't see why requiring it is a good idea.  Can
we turn this around: what actually goes wrong if cgroup v2 were to
allow assigning individual threads if a user specifically requests
it?

A similar point from a different consumer: from the unprivileged
containers point of view, I'm interested in a thread-based interface as
well.  The principal utility of unprivileged containers is to allow
applications that wish to do so to use container properties (effectively
to become self-containerising).  Some that use the producer/consumer
model do use process pools (Apache springs to mind instantly) but some
use thread pools.  It is useful to the latter to preserve the concept of
a thread as being the entity inhabiting the cgroup (but only where the
granularity of the cgroup permits threads to participate) so we can
easily modify them to be self-containerising without forcing them to
switch back from a thread pool model to a process pool model.
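
As a rough illustration of the thread pool case, here is a minimal sketch of a worker thread placing itself in a cgroup. It relies on the cgroup v1 behaviour that writing a thread ID to a cgroup's "tasks" file moves only that thread, while "cgroup.procs" moves the whole thread group. The helper names (attach_current_thread, worker) and the directory passed in are hypothetical, and the sketch assumes the application has already been given write access to that cgroup directory.

```python
import os
import threading


def attach_current_thread(cgroup_dir):
    """Write the calling thread's kernel thread ID to <cgroup_dir>/tasks.

    Assumption: cgroup_dir is a cgroup v1 directory that the caller is
    allowed to write to. In v1, "tasks" accepts individual thread IDs.
    """
    tid = threading.get_native_id()  # kernel-assigned thread ID (Python 3.8+)
    with open(os.path.join(cgroup_dir, "tasks"), "w") as f:
        f.write(f"{tid}\n")
    return tid


def worker(cgroup_dir, results):
    # Each pool thread moves itself into the cgroup before doing any work,
    # so the application stays a single process with a thread pool.
    results.append(attach_current_thread(cgroup_dir))
    # ... the real work of the pool thread would follow here ...
```

A pool would start each thread with something like threading.Thread(target=worker, args=(cgroup_dir, results)), where cgroup_dir is whatever directory the application was delegated. Under the process granularity rule of cgroup v2 discussed above, the same application would have to turn each worker into a separate process to get the equivalent placement.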

I can see that process-based is conceptually easier in v2 because you
begin with a process tree, but it would really be a pity to lose the
thread-based controls we have now and permanently lose the ability to
create more as we find uses for them.  I can't really see how improving
"common resource domain" is a good tradeoff for this.

James