Re: [PATCH v4 28/29] sched: Free the stack early if CONFIG_THREAD_INFO_IN_TASK
From: Andy Lutomirski <luto@amacapital.net>
Date: 2016-06-27 02:35:34
Also in:
lkml
On Sun, Jun 26, 2016 at 2:55 PM, Andy Lutomirski [off-list ref] wrote:
We currently keep every task's stack around until the task_struct itself is freed. This means that we keep the stack allocation alive for longer than necessary and that, under load, we free stacks in big batches whenever RCU drops the last task reference. Neither of these is good for reuse of cache-hot memory, and freeing in batches prevents us from usefully caching small numbers of vmalloced stacks. On architectures that have thread_info on the stack, we can't easily change this, but on architectures that set THREAD_INFO_IN_TASK, we can free it as soon as the task is dead.
This is broken:
-void free_task(struct task_struct *tsk)
+void release_task_stack(struct task_struct *tsk)
{
account_kernel_stack(tsk, -1);
arch_release_thread_stack(tsk->stack);
free_thread_stack(tsk);
+ tsk->stack = NULL;
+#ifdef CONFIG_VMAP_STACK
+ tsk->stack_vm_area = NULL;
+#endif
+}
+
+void free_task(struct task_struct *tsk)
+{
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+ /*
+ * The task is finally done with both the stack and thread_info,
+ * so free both.
+ */
+ release_task_stack(tsk);
+#else
+ /*
+ * If the task had a separate stack allocation, it should be gone
+ * by now.
+ */
+ WARN_ON_ONCE(tsk->stack);
+#endifWe can get to free_task without first going through TASK_DEAD if we fail to clone(). I'm inclined to make release_task_stack be safe to call more than once and to call it unconditionally in free_task, since doing it without branches (calling release_task_stack in the copy_process failure path) will require more ifdeffery and sounds like more trouble than it's worth. --Andy