Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations... | linux-mm

[PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 01/35] mm, slub: don't call flush_all() from slab_debug_trace_open() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 02/35] mm, slub: allocate private object map for debugfs listings · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 03/35] mm, slub: allocate private object map for validate_slab_cache() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 04/35] mm, slub: don't disable irq for debug_check_no_locks_freed() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 05/35] mm, slub: remove redundant unfreeze_partials() from put_cpu_partial() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 06/35] mm, slub: unify cmpxchg_double_slab() and __cmpxchg_double_slab() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 07/35] mm, slub: extract get_partial() from new_slab_objects() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 10/35] mm, slub: restructure new page checks in ___slab_alloc() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 09/35] mm, slub: return slab page from get_partial() and set c->page afterwards · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 08/35] mm, slub: dissolve new_slab_objects() into ___slab_alloc() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 15/35] mm, slub: restore irqs around calling new_slab() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 13/35] mm, slub: do initial checks in ___slab_alloc() with irqs enabled · Vlastimil Babka <hidden> · 2021-08-05
Re: [PATCH v4 13/35] mm, slub: do initial checks in ___slab_alloc() with irqs enabled · Vlastimil Babka <hidden> · 2021-08-15
Re: [PATCH v4 13/35] mm, slub: do initial checks in ___slab_alloc() with irqs enabled · Vlastimil Babka <hidden> · 2021-08-15
[PATCH v4 11/35] mm, slub: simplify kmem_cache_cpu and tid setup · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 14/35] mm, slub: move disabling irqs closer to get_partial() in ___slab_alloc() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 21/35] mm, slub: call deactivate_slab() without disabling irqs · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 17/35] mm, slub: check new pages with restored irqs · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 20/35] mm, slub: make locking in deactivate_slab() irq-safe · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 27/35] mm, slub: don't disable irqs in slub_cpu_dead() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 25/35] mm, slub: separate detaching of partial list in unfreeze_partials() from unfreezing · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 22/35] mm, slub: move irq control into unfreeze_partials() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 16/35] mm, slub: validate slab from partial list or page allocator before making it cpu slab · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 19/35] mm, slub: move reset of c->page and freelist out of deactivate_slab() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 31/35] mm, slub: optionally save/restore irqs in slab_[un]lock()/ · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 33/35] mm, slub: protect put_cpu_partial() with disabled irqs instead of cmpxchg · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 34/35] mm, slub: use migrate_disable() on PREEMPT_RT · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-05
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sven Eckelmann <sven@narfation.org> · 2021-08-15
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Andrew Morton <akpm@linux-foundation.org> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-18
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Thomas Gleixner <hidden> · 2021-08-23
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-17
Re: [PATCH v4 35/35] mm, slub: convert kmem_cpu_slab protection to local_lock · Vlastimil Babka <hidden> · 2021-08-17
[PATCH v4 26/35] mm, slub: only disable irq with spin_lock in __unfreeze_partials() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 32/35] mm, slub: make slab_lock() disable irqs with PREEMPT_RT · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-05
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Qian Cai <hidden> · 2021-08-09
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Mike Galbraith <hidden> · 2021-08-09
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-09
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Qian Cai <hidden> · 2021-08-09
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Mike Galbraith <hidden> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Mike Galbraith <hidden> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · "Paul E. McKenney" <paulmck@kernel.org> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · "Paul E. McKenney" <paulmck@kernel.org> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · "Paul E. McKenney" <paulmck@kernel.org> · 2021-08-11
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · "Paul E. McKenney" <paulmck@kernel.org> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-10
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Qian Cai <hidden> · 2021-08-11
Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context · Vlastimil Babka <hidden> · 2021-08-11
[PATCH v4 30/35] mm: slub: Make object_map_lock a raw_spinlock_t · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 28/35] mm, slab: make flush_slab() possible to call with irqs enabled · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 18/35] mm, slub: stop disabling irqs around get_partial() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 24/35] mm, slub: detach whole partial list at once in unfreeze_partials() · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 23/35] mm, slub: discard slabs in unfreeze_partials() without irqs disabled · Vlastimil Babka <hidden> · 2021-08-05
[PATCH v4 12/35] mm, slub: move disabling/enabling irqs to ___slab_alloc() · Vlastimil Babka <hidden> · 2021-08-05
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Sebastian Andrzej Siewior <bigeasy@linutronix.de> · 2021-08-05
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Mike Galbraith <hidden> · 2021-08-06
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-06
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-10
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-15
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-17
Re: [PATCH v4 00/35] SLUB: reduce irq disabled scope and make it RT compatible · Vlastimil Babka <hidden> · 2021-08-17

Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context

From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2021-08-10 23:53:39
Also in: lkml

On Wed, Aug 11, 2021 at 12:36:00AM +0200, Vlastimil Babka wrote:

On 8/10/2021 10:31 PM, Paul E. McKenney wrote:

quoted

On Tue, Aug 10, 2021 at 01:47:42PM +0200, Mike Galbraith wrote:

quoted

On Tue, 2021-08-10 at 11:03 +0200, Vlastimil Babka wrote:

quoted

On 8/9/21 3:41 PM, Qian Cai wrote:

quoted

 
+static DEFINE_MUTEX(flush_lock);
+static DEFINE_PER_CPU(struct slub_flush_work, slub_flush);
+
 static void flush_all(struct kmem_cache *s)
 {
-       on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1);
+       struct slub_flush_work *sfw;
+       unsigned int cpu;
+
+       mutex_lock(&flush_lock);

Vlastimil, taking the lock here could trigger a warning during memory offline/online due to the locking order:

slab_mutex -> flush_lock

[   91.374541] WARNING: possible circular locking dependency detected
[   91.381411] 5.14.0-rc5-next-20210809+ #84 Not tainted
[   91.387149] ------------------------------------------------------
[   91.394016] lsbug/1523 is trying to acquire lock:
[   91.399406] ffff800018e76530 (flush_lock){+.+.}-{3:3}, at: flush_all+0x50/0x1c8
[   91.407425]
               but task is already holding lock:
[   91.414638] ffff800018e48468 (slab_mutex){+.+.}-{3:3}, at: slab_memory_callback+0x44/0x280
[   91.423603]
               which lock already depends on the new lock.

OK, managed to reproduce in qemu and this fixes it for me on top of
next-20210809. Could you test as well, as your testing might be more
comprehensive? I will format is as a fixup for the proper patch in the series then.

As it appeared it should, moving cpu_hotplug_lock outside slab_mutex in
kmem_cache_destroy() on top of that silenced the cpu offline gripe.

And this one got rid of the remainder of the deadlock, but gets me the
splat shown at the end of this message.  So some sort of middle ground
may be needed.

(Same reproducer as in my previous reply to Vlastimil.)

							Thanx, Paul

quoted

---
 mm/slab_common.c |    2 ++
 mm/slub.c        |    2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

--- a/mm/slab_common.c
+++ b/mm/slab_common.c

@@ -502,6 +502,7 @@ void kmem_cache_destroy(struct kmem_cach
 	if (unlikely(!s))
 		return;

+	cpus_read_lock();
 	mutex_lock(&slab_mutex);

 	s->refcount--;

@@ -516,6 +517,7 @@ void kmem_cache_destroy(struct kmem_cach
 	}
 out_unlock:
 	mutex_unlock(&slab_mutex);
+	cpus_read_unlock();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);

--- a/mm/slub.c
+++ b/mm/slub.c

@@ -4234,7 +4234,7 @@ int __kmem_cache_shutdown(struct kmem_ca
 	int node;
 	struct kmem_cache_node *n;

-	flush_all(s);
+	flush_all_cpus_locked(s);
 	/* Attempt to free all objects */
 	for_each_kmem_cache_node(s, node, n) {
 		free_partial(s, n);

[  602.539109] ------------[ cut here ]------------
[  602.539804] WARNING: CPU: 3 PID: 88 at kernel/cpu.c:335 lockdep_assert_cpus_held+0x29/0x30

So this says the assert failed and we don't have the cpus_read_lock(), right, but...

quoted

[  602.540940] Modules linked in:
[  602.541377] CPU: 3 PID: 88 Comm: torture_shutdow Not tainted 5.14.0-rc5-next-20210809+ #3299
[  602.542536] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
[  602.543786] RIP: 0010:lockdep_assert_cpus_held+0x29/0x30
[  602.544524] Code: 00 83 3d 4d f1 a4 01 01 76 0a 8b 05 4d 23 a5 01 85 c0 75 01 c3 be ff ff ff ff 48 c7 c7 b0 86 66 a3 e8 9b 05 c9 00 85 c0 75 ea <0f> 0b c3 0f 1f 40 00 41 57 41 89 ff 41 56 4d 89 c6 41 55 49 89 cd
[  602.547051] RSP: 0000:ffffb382802efdb8 EFLAGS: 00010246
[  602.547783] RAX: 0000000000000000 RBX: ffffa23301a44000 RCX: 0000000000000001
[  602.548764] RDX: 0000000000000001 RSI: ffffffffa335f5c0 RDI: ffffffffa33adbbf[  602.549747] RBP: ffffa23301a44000 R08: ffffa23302810000 R09: 974cf0ba5c48ad3c
[  602.550727] R10: ffffb382802efe78 R11: 0000000000000001 R12: ffffa23301a44000[  602.551709] R13: 00000000000249c0 R14: 00000000ffffffff R15: 0000000fffffffe0
[  602.552694] FS:  0000000000000000(0000) GS:ffffa2331f580000(0000) knlGS:0000000000000000
[  602.553805] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  602.554606] CR2: 0000000000000000 CR3: 0000000017222000 CR4: 00000000000006e0
[  602.555601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  602.556590] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  602.557585] Call Trace:
[  602.557927]  flush_all_cpus_locked+0x29/0x140
[  602.558535]  __kmem_cache_shutdown+0x26/0x200
[  602.559145]  ? lock_is_held_type+0xd6/0x130
[  602.559739]  ? torture_onoff+0x260/0x260
[  602.560284]  kmem_cache_destroy+0x38/0x110

It should have been taken here. I don't understand. It's as if only the
mm/slub.c was patched by Mike's patch, but mm/slab_common.c not?

You know, you would think that I would have learned how to reliably
apply a patch by now.  Apparently not this morning.

Anyway, right in one!  I will try again with the full patch later.

							Thanx, Paul

quoted

[  602.560859]  rcu_torture_cleanup.cold.36+0x192/0x421
[  602.561539]  ? wait_woken+0x60/0x60
[  602.562035]  ? torture_onoff+0x260/0x260
[  602.562591]  torture_shutdown+0xdd/0x1c0
[  602.563131]  kthread+0x132/0x160
[  602.563592]  ? set_kthread_struct+0x40/0x40
[  602.564172]  ret_from_fork+0x22/0x30
[  602.564696] irq event stamp: 1307
[  602.565161] hardirqs last  enabled at (1315): [<ffffffffa1eddced>] __up_console_sem+0x4d/0x50
[  602.566321] hardirqs last disabled at (1324): [<ffffffffa1eddcd2>] __up_console_sem+0x32/0x50
[  602.567479] softirqs last  enabled at (1304): [<ffffffffa2e00311>] __do_softirq+0x311/0x473
[  602.568616] softirqs last disabled at (1299): [<ffffffffa1e72eb8>] irq_exit_rcu+0xe8/0xf0
[  602.569735] ---[ end trace 26fd643e1df331c9 ]---

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help