Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag
From: Roman Gushchin <hidden>
Date: 2020-09-26 06:43:01
Also in:
linux-ext4, linux-mm, lkml
On Sat, Sep 26, 2020 at 09:43:25AM +0800, Ming Lei wrote:
On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote:quoted
On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt [off-list ref] wrote:quoted
[snip]quoted
I don't think you can ignore the flushing. The __free_once() in ___cache_free() assumes there is a space available. BTW do_drain() also have the same issue. Why not move slabs_destroy() after we update ac->avail and memmove()?Ming, can you please try the following patch? From: Shakeel Butt <redacted> [PATCH] mm: slab: fix potential infinite recursion in ___cache_free With the commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations"), it becomes possible to call kfree() from the slabs_destroy(). However if slabs_destroy() is being called for the array_cache of the local CPU then this opens the potential scenario of infinite recursion because kfree() called from slabs_destroy() can call slabs_destroy() with the same array_cache of the local CPU. Since the array_cache of the local CPU is not updated before calling slabs_destroy(), it will try to free the same pages. To fix the issue, simply update the cache before calling slabs_destroy(). Signed-off-by: Shakeel Butt <redacted> --- mm/slab.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)diff --git a/mm/slab.c b/mm/slab.c index 3160dff6fd76..f658e86ec8ce 100644 --- a/mm/slab.c +++ b/mm/slab.c@@ -1632,6 +1632,10 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page) kmem_cache_free(cachep->freelist_cache, freelist); } +/* + * Update the size of the caches before calling slabs_destroy as it may + * recursively call kfree. + */ static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list) { struct page *page, *n;@@ -2153,8 +2157,8 @@ static void do_drain(void *arg) spin_lock(&n->list_lock); free_block(cachep, ac->entry, ac->avail, node, &list); spin_unlock(&n->list_lock); - slabs_destroy(cachep, &list); ac->avail = 0; + slabs_destroy(cachep, &list); } static void drain_cpu_caches(struct kmem_cache *cachep)@@ -3402,9 +3406,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac) } #endif spin_unlock(&n->list_lock); - slabs_destroy(cachep, &list); ac->avail -= batchcount; memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail); + slabs_destroy(cachep, &list); }The issue can't be reproduced after applying this patch: Tested-by: Ming Lei <redacted>
Perfect, thank you very much for the confirmation! Shakeel, can you, please, resend the patch with the proper fixes tag and the updated commit log? Please, feel free to add Reviewed-by: Roman Gushchin <redacted> . Thank you! Roman