Thread: 115 messages, 7 authors, 2010-08-04

Re: [PATCH repost] sched: export sched_set/getaffinity to modules

From: Sridhar Samudrala <hidden>
Date: 2010-07-14 23:26:45
Also in: kvm, lkml
Subsystem: the rest · Maintainer: Linus Torvalds

On Tue, 2010-07-13 at 14:09 +0300, Michael S. Tsirkin wrote: 
On Mon, Jul 12, 2010 at 11:59:08PM -0700, Sridhar Samudrala wrote:
On 7/4/2010 2:00 AM, Michael S. Tsirkin wrote:
On Fri, Jul 02, 2010 at 11:06:37PM +0200, Oleg Nesterov wrote:
On 07/02, Peter Zijlstra wrote:
On Fri, 2010-07-02 at 11:01 -0700, Sridhar Samudrala wrote:
Does it (Tejun's kthread_clone() patch) also inherit the
cgroup of the caller?
Of course, it's a simple do_fork() which inherits everything just as you
would expect from a similar sys_clone()/sys_fork() call.
Yes. And I'm afraid it can inherit more than we want. IIUC, this is called
from ioctl(), right?

Then the new thread becomes the natural child of the caller, and it shares
->mm with the parent. And files, dup_fd() without CLONE_FILES.

Signals. Say, if you send SIGKILL to this new thread, it can't sleep in
TASK_INTERRUPTIBLE or KILLABLE after that. And this SIGKILL can be sent
just because the parent gets SIGQUIT or another coredumpable signal.
Or the new thread can receive SIGSTOP via ^Z.

Perhaps this is OK, I do not know. Just to remind that kernel_thread()
is merely clone(CLONE_VM).

Oleg.
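
The point about sharing ->mm can be seen from userspace as well. The following is a minimal sketch, assuming Linux with glibc; it uses the userspace clone(2) wrapper, not the kernel's kernel_thread(), and the names run_child(), child_fn() and shared_counter are made up for illustration. With CLONE_VM the child's store is visible to the parent; without it the child writes to its own copy.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for clone() and CLONE_VM in <sched.h> */
#endif
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Hypothetical variable used only to observe memory sharing. */
static volatile int shared_counter;

static int child_fn(void *arg)
{
	(void)arg;
	shared_counter = 42;	/* lands in the parent's memory only with CLONE_VM */
	return 0;
}

/*
 * Run child_fn() in a clone()d child with the given flags, wait for it,
 * and return the value of shared_counter as the parent sees it afterwards.
 * Returns -1 on error.
 */
int run_child(int clone_flags)
{
	char *stack = malloc(CHILD_STACK_SIZE);
	pid_t pid;
	int seen = -1;

	if (!stack)
		return -1;

	shared_counter = 0;
	/* The stack grows down on common architectures, so pass its top. */
	pid = clone(child_fn, stack + CHILD_STACK_SIZE,
		    clone_flags | SIGCHLD, NULL);
	if (pid >= 0 && waitpid(pid, NULL, 0) == pid)
		seen = shared_counter;

	free(stack);
	return seen;
}

/* Usage: compare a CLONE_VM child with a fork()-like child. */
void demo_shared_vm(void)
{
	assert(run_child(CLONE_VM) == 42);	/* shared address space */
	assert(run_child(0) == 0);		/* private copy, parent unaffected */
}
```

The child here is still a natural child of the caller and is reaped with waitpid(), which mirrors the parent/child relationship described above.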
Right. Doing this might break things like flush.  The signal and exit
behaviour needs to be examined carefully. I am also unsure whether
using such threads might be more expensive than inheriting kthreadd.
Should we just leave it to userspace to set the cgroup/cpumask
after qemu starts the guest and the vhost threads?

Thanks
Sridhar
Yes, but we can't trust userspace to do this. It's important
to do it on thread creation: if we don't, malicious userspace
can create a large amount of work exceeding the cgroup limits.

And the same applies to the affinity: if the qemu process
is limited to a set of CPUs, it's important to make
the kernel thread that does work on our behalf limited to the same
set of CPUs.

This is not unique to vhost, it's just that virt scenarios are affected
by this more: people seem to run untrusted applications and expect the
damage to be contained.
OK. So we want to create a thread that is a child of kthreadd, but inherits the cgroup/cpumask
from the caller. How about an exported kthread function kthread_create_in_current_cg() 
that does this?
diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index aabc8a1..e0616f0 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -9,6 +9,9 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
 				   const char namefmt[], ...)
 	__attribute__((format(printf, 3, 4)));
 
+struct task_struct *kthread_create_in_current_cg(int (*threadfn)(void *data),
+						 void *data, char *name);
+
 /**
  * kthread_run - create and wake a thread.
  * @threadfn: the function to run until signal_pending(current).
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 83911c7..ea4e737 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <trace/events/sched.h>
+#include <linux/cgroup.h>
 
 static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
@@ -149,6 +150,45 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
 }
 EXPORT_SYMBOL(kthread_create);
 
+struct task_struct *kthread_create_in_current_cg(int (*threadfn)(void *data),
+						 void *data, char *name)
+{
+	struct task_struct *worker;
+	cpumask_var_t mask;
+	int ret = -ENOMEM;
+
+	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
+		goto out_free_mask;
+
+	worker = kthread_create(threadfn, data, "%s-%d", name, current->pid);
+	if (IS_ERR(worker)) {
+		ret = PTR_ERR(worker);
+		goto out_free_mask;
+	}
+
+	ret = sched_getaffinity(current->pid, mask);
+	if (ret)
+		goto out_stop_worker;
+
+	ret = sched_setaffinity(worker->pid, mask);
+	if (ret)
+		goto out_stop_worker;
+
+	ret = cgroup_attach_task_current_cg(worker);
+	if (ret)
+		goto out_stop_worker;
+
+	free_cpumask_var(mask);
+	return worker;
+
+out_stop_worker:
+	kthread_stop(worker);
+out_free_mask:
+	free_cpumask_var(mask);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(kthread_create_in_current_cg);
+
 /**
  * kthread_bind - bind a just-created kthread to a cpu.
  * @p: thread created by kthread_create().

Thanks
Sridhar
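
For comparison, the affinity step in the patch above (sched_getaffinity() on the caller, then sched_setaffinity() on the worker) has a direct userspace analogue. The following is a minimal sketch, assuming Linux with glibc; copy_affinity() and same_affinity() are hypothetical helper names, and the calls are the glibc wrappers sched_getaffinity(2)/sched_setaffinity(2), not the in-kernel functions the patch uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for cpu_set_t and the affinity calls */
#endif
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Copy the CPU affinity mask of task `from` onto task `to`.
 * A pid of 0 means the calling thread. Returns 0 on success,
 * -1 on failure with errno set by the failing call.
 */
int copy_affinity(pid_t from, pid_t to)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	if (sched_getaffinity(from, sizeof(mask), &mask) != 0)
		return -1;
	return sched_setaffinity(to, sizeof(mask), &mask);
}

/* Return 1 if both tasks have the same mask, 0 if not, -1 on error. */
int same_affinity(pid_t a, pid_t b)
{
	cpu_set_t ma, mb;

	CPU_ZERO(&ma);
	CPU_ZERO(&mb);
	if (sched_getaffinity(a, sizeof(ma), &ma) != 0 ||
	    sched_getaffinity(b, sizeof(mb), &mb) != 0)
		return -1;
	return CPU_EQUAL(&ma, &mb) ? 1 : 0;
}
```

A userspace caller would run copy_affinity(0, worker_pid) right after creating the worker. As discussed earlier in the thread, doing this from userspace leaves a window in which the worker runs unrestricted, which is why the patch applies the mask in the kernel at creation time.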