Thread (74 messages) 74 messages, 6 authors, 2017-01-13

Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying.

From: Michal Hocko <mhocko@suse.com>
Date: 2016-12-27 10:57:19

On Tue 27-12-16 19:39:28, Tetsuo Handa wrote:
Michal Hocko wrote:
quoted
On Sat 24-12-16 15:25:43, Tetsuo Handa wrote:
[...]
quoted
quoted
Thus, I'm proposing to save CPU time if waiting for the OOM killer/reaper
when direct reclaim did not help.
Which will just move problem somewhere else I am afraid. Now you will
have hundreds of tasks bouncing on the global mutex. That never turned
out to be a good thing in the past and I am worried that it will just
bite us from a different side. What is worse it might hit us in cases
which do actually happen in the real life.

I am not saying that the current code works perfectly when we are
hitting the direct reclaim close to the OOM but improving that requires
much more than slapping a global lock there.
So, we finally agreed that there are problems when we are hitting the direct
reclaim close to the OOM. Good.
There has never been a disagreement here. The point we seem to be
disagreeing is how much those issues you are seeing matter. I do not
consider them top priority because they are not happening in real life
enough.
 
quoted
quoted
quoted
quoted
Then, you call such latency as scheduler's problem?
mutex_lock_killable(&oom_lock) change helps coping with whatever delays
OOM killer/reaper might encounter.
It helps _your_ particular insane workload. I believe you can construct
many others which which would cause a similar problem and the above
suggestion wouldn't help a bit. Until I can see this is easily
triggerable on a reasonably configured system then I am not convinced
we should add more non trivial changes to the oom killer path.
I'm not using root privileges nor realtime priority nor CONFIG_PREEMPT=y.
Why you don't care about the worst situation / corner cases?
I do care about them! I just do not want to put random hacks which might
seem to work on this _particular_ workload while it brings risks for
others. Look, those corner cases you are simulating are _interesting_ to
see how robust we are but they are no way close to what really happens
in the real life out there - we call those situations DoS from any
practical POV. Admins usually do everything to prevent from them by
configuring their systems and limiting untrusted users as much as
possible.
I wonder why you introduce "untrusted users" concept. From my experience,
there was no "untrusted users". All users who use their systems are trusted
and innocent, but they _by chance_ hit problems when close to (or already)
the OOM.
my experience is that innocent users are no way close to what you are
simulating. And we tend to handle most OOMs just fine in my experience.
 
[...]
quoted
Just try to remember how you were pushing really hard for oom timeouts
one year back because the OOM killer was suboptimal and could lockup. It
took some redesign and many changes to fix that. The result is
imho a better, more predictable and robust code which wouldn't be the
case if we just went your way to have a fix quickly...
I agree that the result is good for users who can update kernels. But that
change was too large to backport. Any approach which did not in time for
customers' deadline of deciding their kernels to use for 10 years is
useless for them. Lack of catch-all reporting/triggering mechanism is
unhappy for both customers and troubleshooting staffs at support centers.
Then implement whatever you find appropriate on those old kernels and
deal with the follow up reports. This is the fair deal you have cope
with when using and supporting old kernels.
 
Improving the direct reclaim close to the OOM requires a lot of effort.
We might add new bugs during that effort. So, where is valid reason that
we can not have asynchronous watchdog like kmallocwd? Please do explain
at kmallocwd thread. You have never persuaded me about keeping kmallocwd
out of tree.
I am not going to repeat my arguments over again. I haven't nacked that
patch and it seems there is no great interest in it so do not try to
claim that it is me who is blocking this feature. I just do not think it
is worth it.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help