Thread (76 messages) 76 messages, 8 authors, 2021-08-23

Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context

From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2021-08-10 23:53:39
Also in: lkml

On Wed, Aug 11, 2021 at 12:36:00AM +0200, Vlastimil Babka wrote:
On 8/10/2021 10:31 PM, Paul E. McKenney wrote:
quoted
On Tue, Aug 10, 2021 at 01:47:42PM +0200, Mike Galbraith wrote:
quoted
On Tue, 2021-08-10 at 11:03 +0200, Vlastimil Babka wrote:
quoted
On 8/9/21 3:41 PM, Qian Cai wrote:
quoted
quoted
 
+static DEFINE_MUTEX(flush_lock);
+static DEFINE_PER_CPU(struct slub_flush_work, slub_flush);
+
 static void flush_all(struct kmem_cache *s)
 {
-       on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1);
+       struct slub_flush_work *sfw;
+       unsigned int cpu;
+
+       mutex_lock(&flush_lock);
Vlastimil, taking the lock here could trigger a warning during memory offline/online due to the locking order:

slab_mutex -> flush_lock

[   91.374541] WARNING: possible circular locking dependency detected
[   91.381411] 5.14.0-rc5-next-20210809+ #84 Not tainted
[   91.387149] ------------------------------------------------------
[   91.394016] lsbug/1523 is trying to acquire lock:
[   91.399406] ffff800018e76530 (flush_lock){+.+.}-{3:3}, at: flush_all+0x50/0x1c8
[   91.407425]
               but task is already holding lock:
[   91.414638] ffff800018e48468 (slab_mutex){+.+.}-{3:3}, at: slab_memory_callback+0x44/0x280
[   91.423603]
               which lock already depends on the new lock.
OK, managed to reproduce in qemu and this fixes it for me on top of
next-20210809. Could you test as well, as your testing might be more
comprehensive? I will format is as a fixup for the proper patch in the series then.
As it appeared it should, moving cpu_hotplug_lock outside slab_mutex in
kmem_cache_destroy() on top of that silenced the cpu offline gripe.
And this one got rid of the remainder of the deadlock, but gets me the
splat shown at the end of this message.  So some sort of middle ground
may be needed.

(Same reproducer as in my previous reply to Vlastimil.)

							Thanx, Paul
quoted
---
 mm/slab_common.c |    2 ++
 mm/slub.c        |    2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -502,6 +502,7 @@ void kmem_cache_destroy(struct kmem_cach
 	if (unlikely(!s))
 		return;

+	cpus_read_lock();
 	mutex_lock(&slab_mutex);

 	s->refcount--;
@@ -516,6 +517,7 @@ void kmem_cache_destroy(struct kmem_cach
 	}
 out_unlock:
 	mutex_unlock(&slab_mutex);
+	cpus_read_unlock();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4234,7 +4234,7 @@ int __kmem_cache_shutdown(struct kmem_ca
 	int node;
 	struct kmem_cache_node *n;

-	flush_all(s);
+	flush_all_cpus_locked(s);
 	/* Attempt to free all objects */
 	for_each_kmem_cache_node(s, node, n) {
 		free_partial(s, n);
[  602.539109] ------------[ cut here ]------------
[  602.539804] WARNING: CPU: 3 PID: 88 at kernel/cpu.c:335 lockdep_assert_cpus_held+0x29/0x30
So this says the assert failed and we don't have the cpus_read_lock(), right, but...
quoted
[  602.540940] Modules linked in:
[  602.541377] CPU: 3 PID: 88 Comm: torture_shutdow Not tainted 5.14.0-rc5-next-20210809+ #3299
[  602.542536] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
[  602.543786] RIP: 0010:lockdep_assert_cpus_held+0x29/0x30
[  602.544524] Code: 00 83 3d 4d f1 a4 01 01 76 0a 8b 05 4d 23 a5 01 85 c0 75 01 c3 be ff ff ff ff 48 c7 c7 b0 86 66 a3 e8 9b 05 c9 00 85 c0 75 ea <0f> 0b c3 0f 1f 40 00 41 57 41 89 ff 41 56 4d 89 c6 41 55 49 89 cd
[  602.547051] RSP: 0000:ffffb382802efdb8 EFLAGS: 00010246
[  602.547783] RAX: 0000000000000000 RBX: ffffa23301a44000 RCX: 0000000000000001
[  602.548764] RDX: 0000000000000001 RSI: ffffffffa335f5c0 RDI: ffffffffa33adbbf[  602.549747] RBP: ffffa23301a44000 R08: ffffa23302810000 R09: 974cf0ba5c48ad3c
[  602.550727] R10: ffffb382802efe78 R11: 0000000000000001 R12: ffffa23301a44000[  602.551709] R13: 00000000000249c0 R14: 00000000ffffffff R15: 0000000fffffffe0
[  602.552694] FS:  0000000000000000(0000) GS:ffffa2331f580000(0000) knlGS:0000000000000000
[  602.553805] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  602.554606] CR2: 0000000000000000 CR3: 0000000017222000 CR4: 00000000000006e0
[  602.555601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  602.556590] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  602.557585] Call Trace:
[  602.557927]  flush_all_cpus_locked+0x29/0x140
[  602.558535]  __kmem_cache_shutdown+0x26/0x200
[  602.559145]  ? lock_is_held_type+0xd6/0x130
[  602.559739]  ? torture_onoff+0x260/0x260
[  602.560284]  kmem_cache_destroy+0x38/0x110
It should have been taken here. I don't understand. It's as if only the
mm/slub.c was patched by Mike's patch, but mm/slab_common.c not?
You know, you would think that I would have learned how to reliably
apply a patch by now.  Apparently not this morning.

Anyway, right in one!  I will try again with the full patch later.

							Thanx, Paul
quoted
[  602.560859]  rcu_torture_cleanup.cold.36+0x192/0x421
[  602.561539]  ? wait_woken+0x60/0x60
[  602.562035]  ? torture_onoff+0x260/0x260
[  602.562591]  torture_shutdown+0xdd/0x1c0
[  602.563131]  kthread+0x132/0x160
[  602.563592]  ? set_kthread_struct+0x40/0x40
[  602.564172]  ret_from_fork+0x22/0x30
[  602.564696] irq event stamp: 1307
[  602.565161] hardirqs last  enabled at (1315): [<ffffffffa1eddced>] __up_console_sem+0x4d/0x50
[  602.566321] hardirqs last disabled at (1324): [<ffffffffa1eddcd2>] __up_console_sem+0x32/0x50
[  602.567479] softirqs last  enabled at (1304): [<ffffffffa2e00311>] __do_softirq+0x311/0x473
[  602.568616] softirqs last disabled at (1299): [<ffffffffa1e72eb8>] irq_exit_rcu+0xe8/0xf0
[  602.569735] ---[ end trace 26fd643e1df331c9 ]---
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help