Thread (21 messages) 21 messages, 6 authors, 2016-09-29

Re: [PATCH] mm: warn about allocations which stall for too long

From: Michal Hocko <mhocko@kernel.org>
Date: 2016-09-26 08:17:57
Also in: lkml

On Sat 24-09-16 12:00:07, Tetsuo Handa wrote:
Michal Hocko wrote:
quoted
On Fri 23-09-16 23:36:22, Tetsuo Handa wrote:
quoted
Michal Hocko wrote:
quoted
@@ -3659,6 +3661,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	else
 		no_progress_loops++;
 
+	/* Make sure we know about allocations which stall for too long */
+	if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {
Should we check !__GFP_NOWARN ? I think __GFP_NOWARN is likely used with
__GFP_NORETRY, and __GFP_NORETRY is already checked by now.

I think printing warning regardless of __GFP_NOWARN is better because
this check is similar to hungtask warning.
Well, if the user said to not warn we should really obey that. Why would
that matter?
__GFP_NOWARN is defined as "Do not print failure messages when memory
allocation failed". It is not defined as "Do not print OOM killer messages
when OOM killer is invoked". It is undefined that "Do not print stall
messages when memory allocation is stalling".
Which is kind of expected as we warned only about allocation failures up
to now.
If memory allocating threads were blocked on locks instead of doing direct
reclaim, hungtask will be able to find stalling memory allocations without
this change. Since direct reclaim prevents allocating threads from sleeping
for long enough to be warned by hungtask, it is important that this change
shall find allocating threads which cannot be warned by hungtask. That is,
not printing warning messages for __GFP_NOWARN allocation requests looses
the value of this change.
I dunno. If the user explicitly requests to not have allocation warning
then I think we should obey that. But this is not something I would be
really insisting hard. If others think that the check should be dropped
I can live with that.

[...]
quoted
quoted
) rather than by line number, and surround __warn_memalloc_stall() call with
mutex in order to serialize warning messages because it is possible that
multiple allocation requests are stalling?
we do not use any lock in warn_alloc_failed so why this should be any
different?
warn_alloc_failed() is called for both __GFP_DIRECT_RECLAIM and
!__GFP_DIRECT_RECLAIM allocation requests, and it is not allowed
to sleep if !__GFP_DIRECT_RECLAIM. Thus, we have to tolerate that
concurrent memory allocation failure messages make dmesg output
unreadable. But __warn_memalloc_stall() is called for only
__GFP_DIRECT_RECLAIM allocation requests. Thus, we are allowed to
sleep in order to serialize concurrent memory allocation stall
messages.
I still do not see a point. A single line about the warning and locked
dump_stack sounds sufficient to me.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help