Thread: 15 messages, 4 authors, 2016-07-18

Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup

From: Michal Hocko <mhocko@kernel.org>
Date: 2016-07-13 08:09:29
Also in: linux-mm, lkml

On Tue 12-07-16 08:35:06, Shayan Pooya wrote:
>>> With strace, when running 500 concurrent mem-hog tasks on the same
>>> kernel, 33 of them failed with:
>>>
>>> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
>>> `THREAD_GETMEM (self, tid) != ppid' failed.
>>>
>>> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
>>> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
>>> patch was not accepted.
>>
>> OK, so the problem is that the oom killed task doesn't report the futex
>> release properly? If yes then I fail to see how that is memcg specific.
>> Could you try to clarify what you consider a bug again, please? I am not
>> really sure I understand this report.
>
> It looks like it is just a very easy way to reproduce the problem that
> Konstantin described in that lkml thread. That patch was not accepted
> and I see no other fixes for that issue upstream. Here is a copy of
> his root-cause analysis from said thread:
>
> Whole sequence looks like: task calls fork, glibc calls syscall clone with
> CLONE_CHILD_SETTID and passes pointer to TLS THREAD_SELF->tid as argument.
> Child task gets read-only copy of VM including TLS. Child calls put_user()
> to handle CLONE_CHILD_SETTID from schedule_tail(). put_user() trigger page
> fault and it fails because do_wp_page()  hits memcg limit without invoking
> OOM-killer because this is page-fault from kernel-space.  Put_user returns
> -EFAULT, which is ignored.  Child returns into user-space and catches here
> assert (THREAD_GETMEM (self, tid) != ppid), glibc tries to print something
> but hangs on deadlock on internal locks. Halt and catch fire.
OK, I see! Thanks for the clarification. So the bug is that the put_user()
return value is ignored. Let's see whether Konstantin's patch will be
accepted or Oleg comes up with something else.
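
The sequence in the quoted analysis can be sketched as a toy userspace
model. This is only an illustration under stated assumptions: every
model_* name below is hypothetical, nothing here is kernel or glibc code,
and the "checked" variant is not Konstantin's patch or any proposed fix.
It only shows why a discarded -EFAULT leaves the child's cached tid equal
to the parent's pid, which is the condition the glibc assertion rejects.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for put_user(): the write fails with -EFAULT
 * when the simulated copy-on-write fault hits the memcg limit. */
int model_put_user(int value, int *dst, bool memcg_limit_hit)
{
    if (memcg_limit_hit)
        return -EFAULT;     /* the store never reaches *dst */
    *dst = value;
    return 0;
}

/* Models the behaviour described in the thread: the result of the
 * CLONE_CHILD_SETTID store is discarded. */
void model_schedule_tail_ignoring(int child_tid, int *tls_tid,
                                  bool memcg_limit_hit)
{
    (void)model_put_user(child_tid, tls_tid, memcg_limit_hit);
}

/* Hypothetical variant that hands the failure back to the caller.
 * What the caller should then do is exactly the open question. */
int model_schedule_tail_checked(int child_tid, int *tls_tid,
                                bool memcg_limit_hit)
{
    return model_put_user(child_tid, tls_tid, memcg_limit_hit);
}

/* Models the glibc check: after fork, the child's cached tid must
 * differ from the parent's pid. Returns false when it would fire. */
bool model_glibc_tid_ok(const int *tls_tid, int ppid)
{
    assert(tls_tid != NULL);
    return *tls_tid != ppid;
}
```

In this model the child starts with a copy of the parent's TLS, so its
cached tid initially equals ppid. When the store fails and the error is
dropped, model_glibc_tid_ok() returns false; with the checked variant the
caller at least sees -EFAULT and can decide what to do with it.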
-- 
Michal Hocko
SUSE Labs
