Thread (33 messages) 33 messages, 5 authors, 2015-08-21

Re: [PATCH 06/10] mm: page_alloc: Distinguish between being unable to sleep, unwilling to unwilling and avoiding waking kswapd

From: Vlastimil Babka <hidden>
Date: 2015-08-19 14:44:45
Also in: lkml
Subsystem: memory management, memory management - page allocator, the rest · Maintainers: Andrew Morton, Vlastimil Babka, Linus Torvalds

On 08/12/2015 12:45 PM, Mel Gorman wrote:
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min". __GFP_HIGH users get
access to the first lower watermark and can be called the "high priority
reserve". Atomic users and interrupts access yet another lower watermark
that can be called the "atomic reserve".

Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are
willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers
that want to wake kswapd for background reclaim. __GFP_WAIT is redefined
as a caller that is willing to enter direct reclaim and wake kswapd for
background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag. Callers with
  interrupts disabled still automatically use the atomic reserves.

o Callers that have a limited mempool to guarantee forward progress use
  __GFP_DIRECT_RECLAIM. bio allocations fall into this category where
  kswapd will still be woken but atomic reserves are not used as there
  is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allows_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

The key hazard to watch out for is callers that removed __GFP_WAIT and
was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

Signed-off-by: Mel Gorman <redacted>
I like the approach, it makes things much clearer. Note that this isn't full
review, just something that crossed my mind.
quoted hunk ↗ jump to hunk
@@ -117,9 +132,9 @@ struct vm_area_struct;
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
 #define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE)
 #define GFP_IOFS	(__GFP_IO | __GFP_FS)
-#define GFP_TRANSHUGE	(GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
-			 __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | \
-			 __GFP_NO_KSWAPD)
+#define GFP_TRANSHUGE	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
+			 __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN) & \
+			 ~__GFP_KSWAPD_RECLAIM)
Unfortunately this is not as simple for all uses of GFP_TRANSHUGE.
Namely in __alloc_pages_slowpath() the checks could use __GFP_NO_KSWAPD as one
of the distinguishing flags, but to test for lack of __GFP_KSWAPD_RECLAIM, they
should be adjusted in order to be functionally equivalent.
Yes, it would be better if we could get rid of them, but that's out of scope
here. So, something like this?
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index beda417..e019a89 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3073,7 +3073,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto got_pg;
 
 	/* Checks for THP-specific high-order allocations */
-	if ((gfp_mask & GFP_TRANSHUGE) == GFP_TRANSHUGE) {
+	if ((gfp_mask & (GFP_TRANSHUGE | __GFP_KSWAPD_RECLAIM))
+							== GFP_TRANSHUGE) {
 		/*
 		 * If compaction is deferred for high-order allocations, it is
 		 * because sync compaction recently failed. If this is the case
@@ -3108,8 +3109,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * fault, so use asynchronous memory compaction for THP unless it is
 	 * khugepaged trying to collapse.
 	 */
-	if ((gfp_mask & GFP_TRANSHUGE) != GFP_TRANSHUGE ||
-						(current->flags & PF_KTHREAD))
+	if ((gfp_mask & (GFP_TRANSHUGE | __GFP_KSWAPD_RECLAIM)
+			!= GFP_TRANSHUGE) || (current->flags & PF_KTHREAD))
 		migration_mode = MIGRATE_SYNC_LIGHT;
 
 	/* Try direct reclaim and then allocating */



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help