Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context
From: Qian Cai <hidden>
Date: 2021-08-09 22:14:03
Also in:
lkml
On 8/9/2021 4:08 PM, Vlastimil Babka wrote:
On 8/9/2021 8:44 PM, Mike Galbraith wrote:quoted
On Mon, 2021-08-09 at 09:41 -0400, Qian Cai wrote:quoted
On 8/5/2021 11:19 AM, Vlastimil Babka wrote:quoted
+static DEFINE_MUTEX(flush_lock); +static DEFINE_PER_CPU(struct slub_flush_work, slub_flush); + static void flush_all(struct kmem_cache *s) { - on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1); + struct slub_flush_work *sfw; + unsigned int cpu; + + mutex_lock(&flush_lock);Vlastimil, taking the lock here could trigger a warning during memory offline/online due to the locking order: slab_mutex -> flush_lockBugger. That chain ending with cpu_hotplug_lock makes slub_cpu_dead() taking slab_mutex a non-starter for cpu hotplug as well. It's established early by kernel_init_freeable()..kmem_cache_destroy() as well as by slab_mem_going_offline_callback().I suck at reading the lockdep splats, so I don't see yet how the "existing reverse order" occurs - I do understand the order in the "lsbug". What I also wonder is why didn't this occur also in the older RT trees with this patch. I did change the order of locks in flush_all() to take flush_lock first and cpus_read_lock() second, as Cyrill Gorcunov suggested. Would the original order prevent this? Or we would fail anyway because we already took cpus_read_lock() in offline_pages() and now are taking it again - do these nest or not?
"lsbug" is just an user-space tool running workloads like memory offline/online
via sysfs. The splat indicated that the existing locking orders on the running
system saw so far are:
flush_lock -> cpu_hotplug_lock (in #1)
cpu_hotplug_lock -> pck_batch_high_lock (in #2)
pcp_batch_high_lock -> (memory_chain).rwsem (in #3)
(memory_chain).rwsem -> slab_mutex (in #4)
Thus, lockdep inferences that taking flush_lock first could later reaching
slab_mutex. Then, in the commit, memory offline (in #0) started to take the locking
order slab_mutex -> flush_lock. Thus, the potential deadlock warning.