Thread (29 messages) 29 messages, 5 authors, 2020-09-27

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

From: Roman Gushchin <hidden>
Date: 2020-09-26 06:43:01
Also in: linux-ext4, linux-mm, lkml

On Sat, Sep 26, 2020 at 09:43:25AM +0800, Ming Lei wrote:
On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote:
quoted
On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt [off-list ref]
wrote:
quoted
[snip]
quoted
I don't think you can ignore the flushing. The __free_once() in
___cache_free() assumes there is a space available.

BTW do_drain() also have the same issue.

Why not move slabs_destroy() after we update ac->avail and memmove()?
Ming, can you please try the following patch?


From: Shakeel Butt <redacted>

[PATCH] mm: slab: fix potential infinite recursion in ___cache_free

With the commit 10befea91b61 ("mm: memcg/slab: use a single set of
kmem_caches for all allocations"), it becomes possible to call kfree()
from the slabs_destroy(). However if slabs_destroy() is being called for
the array_cache of the local CPU then this opens the potential scenario
of infinite recursion because kfree() called from slabs_destroy() can
call slabs_destroy() with the same array_cache of the local CPU. Since
the array_cache of the local CPU is not updated before calling
slabs_destroy(), it will try to free the same pages.

To fix the issue, simply update the cache before calling
slabs_destroy().

Signed-off-by: Shakeel Butt <redacted>
---
 mm/slab.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 3160dff6fd76..f658e86ec8ce 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1632,6 +1632,10 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
 		kmem_cache_free(cachep->freelist_cache, freelist);
 }
 
+/*
+ * Update the size of the caches before calling slabs_destroy as it may
+ * recursively call kfree.
+ */
 static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
 {
 	struct page *page, *n;
@@ -2153,8 +2157,8 @@ static void do_drain(void *arg)
 	spin_lock(&n->list_lock);
 	free_block(cachep, ac->entry, ac->avail, node, &list);
 	spin_unlock(&n->list_lock);
-	slabs_destroy(cachep, &list);
 	ac->avail = 0;
+	slabs_destroy(cachep, &list);
 }
 
 static void drain_cpu_caches(struct kmem_cache *cachep)
@@ -3402,9 +3406,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
 	}
 #endif
 	spin_unlock(&n->list_lock);
-	slabs_destroy(cachep, &list);
 	ac->avail -= batchcount;
 	memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail);
+	slabs_destroy(cachep, &list);
 }
The issue can't be reproduced after applying this patch:

Tested-by: Ming Lei <redacted>
Perfect, thank you very much for the confirmation!

Shakeel, can you, please, resend the patch with the proper fixes tag
and the updated commit log? Please, feel free to add
Reviewed-by: Roman Gushchin <redacted> .

Thank you!

Roman
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help