Thread (9 messages) 9 messages, 5 authors, 2014-09-22

Re: Racy manipulation of task_struct->flags in cgroups code causes hard to reproduce kernel panics

From: Tetsuo Handa <hidden>
Date: 2014-09-21 05:16:15
Also in: lkml

Tejun Heo wrote:
Hello,

On Sat, Sep 20, 2014 at 01:55:54PM +0800, Zefan Li wrote:
quoted
quoted
Then, what made current->flags to unexpectedly preserve PF_USED_MATH flag?
The user is running cgrulesengd process in order to utilize cpuset cgroup.
Thus, cpuset_update_task_spread_flag() is called when cgrulesengd process
writes someone's pid to /cgroup/cpuset/$group/tasks interface.

cpuset_update_task_spread_flag() is updating other thread's
"struct task_struct"->flags without exclusion control or atomic
operations!

---------- linux-2.6.32-358.23.2.el6/kernel/cpuset.c ----------
300:/*
301: * update task's spread flag if cpuset's page/slab spread flag is set
302: *
303: * Called with callback_mutex/cgroup_mutex held
304: */
305:static void cpuset_update_task_spread_flag(struct cpuset *cs,
306:                                    struct task_struct *tsk)
307:{
308:    if (is_spread_page(cs))
309:            tsk->flags |= PF_SPREAD_PAGE;
310:    else
311:            tsk->flags &= ~PF_SPREAD_PAGE;
312:    if (is_spread_slab(cs))
313:            tsk->flags |= PF_SPREAD_SLAB;
314:    else
315:            tsk->flags &= ~PF_SPREAD_SLAB;
316:}
We should make the updating of this flag atomic.
Ugh, why do we even implement that in cpuset.  This should be handled
by MPOL_INTERLEAVE.  It feels like people have been using cpuset as
the dumpsite that people used w/o thinking much.  Going forward, let's
please confine cpuset to collective cpu and memory affinity
configuration.  It really shouldn't be implementing novel features for
scheduler or mm.

Anyways, yeah, the patch looks correct to me.  Can you please send a
version w/ proper description and sob?
This race condition exists since commit 950592f7b991 ("cpusets: update
tasks' page/slab spread flags in time") (i.e. Linux 2.6.31 and later)
but "struct task_struct"->atomic_flags was added in Linux 3.17.

If use of ->atomic_flags for cpuset is acceptable, how should we fix
older kernels? Backport ->atomic_flags?
Thanks.

-- 
tejun
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help