Re: [PATCH] cgroup/cpuset: fix circular locking dependency
From: Prateek Sood <hidden>
Date: 2017-12-28 20:37:33
Also in:
lkml
On 12/13/2017 09:36 PM, Tejun Heo wrote:
> Hello, Prateek.
>
> On Wed, Dec 13, 2017 at 01:20:46PM +0530, Prateek Sood wrote:
>> This change makes the usage of cpuset_hotplug_workfn() from cpu
>> hotplug path synchronous. For memory hotplug it still remains
>> asynchronous.
>
> Ah, right.
>
>> Memory migration happening from cpuset_hotplug_workfn() is
>> already asynchronous by queuing cpuset_migrate_mm_workfn() in
>> cpuset_migrate_mm_wq.
>>
>> cpuset_hotplug_workfn()
>>    cpuset_hotplug_workfn(()
>>       cpuset_migrate_mm()
>>          queue_work(cpuset_migrate_mm_wq)
>>
>> It seems that memory migration latency might not have
>> impact with this change.
>>
>> Please let me know if you meant something else by cpuset
>> migration taking time when memory migration is turned on.
>
> No, I didn't. I was just confused about which part became
> synchronous. So, I don't have anything against making the cpu part
> synchronous, but let's not do that as the fix to the deadlocks cuz,
> while we can avoid them by changing cpuset, I don't think cpuset is
> the root cause for them. If there are benefits to making cpuset cpu
> migration synchronous, let's do that for those benefits.
>
> Thanks.
TJ,

One more deadlock scenario

Task: sh
[<ffffff874f917290>] wait_for_completion+0x14
[<ffffff874e8a82e0>] cpuhp_kick_ap_work+0x80 //waiting for cpuhp/2
[<ffffff874f913780>] _cpu_down+0xe0
[<ffffff874e8a9934>] cpu_down+0x38
[<ffffff874ef6e4ec>] cpu_subsys_offline+0x10

Task: cpuhp/2
[<ffffff874f91645c>] schedule+0x38
[<ffffff874e92c76c>] _synchronize_rcu_expedited+0x2ec
[<ffffff874e92c874>] synchronize_sched_expedited+0x60
[<ffffff874e92c9f8>] synchronize_sched+0xb0
[<ffffff874e9104e4>] sugov_stop+0x58
[<ffffff874f36967c>] cpufreq_stop_governor+0x48
[<ffffff874f36a89c>] cpufreq_offline+0x84
[<ffffff874f36aa30>] cpuhp_cpufreq_offline+0xc
[<ffffff874e8a797c>] cpuhp_invoke_callback+0xac
[<ffffff874e8a89b4>] cpuhp_down_callbacks+0x58
[<ffffff874e8a95e8>] cpuhp_thread_fun+0xa8

_synchronize_rcu_expedited is waiting for execution of rcu
expedited grace period work item wait_rcu_exp_gp()

Task: kworker/2:1
[<ffffff874f91645c>] schedule+0x38
[<ffffff874f916870>] schedule_preempt_disabled+0x20
[<ffffff874f918df8>] __mutex_lock_slowpath+0x158
[<ffffff874f919004>] mutex_lock+0x14
[<ffffff874e8a77b0>] get_online_cpus+0x34 //waiting for cpu_hotplug_lock
[<ffffff874e96452c>] rebuild_sched_domains+0x30
[<ffffff874e964648>] cpuset_hotplug_workfn+0xb8
[<ffffff874e8c27b8>] process_one_work+0x168
[<ffffff874e8c2ff4>] worker_thread+0x140
[<ffffff874e8c95b8>] kthread+0xe0

cpu_hotplug_lock is acquired by task: sh

Task: kworker/2:3
[<ffffff874f91645c>] schedule+0x38
[<ffffff874f91a384>] schedule_timeout+0x1d8
[<ffffff874f9171d4>] wait_for_common+0xb4
[<ffffff874f917304>] wait_for_completion_killable+0x14 //waiting for kthreadd
[<ffffff874e8c931c>] __kthread_create_on_node+0xec
[<ffffff874e8c9448>] kthread_create_on_node+0x64
[<ffffff874e8c2d88>] create_worker+0xb4
[<ffffff874e8c3194>] worker_thread+0x2e0
[<ffffff874e8c95b8>] kthread+0xe0

Task: kthreadd
[<ffffff874e8858ec>] __switch_to+0x94
[<ffffff874f915ed8>] __schedule+0x2a8
[<ffffff874f91645c>] schedule+0x38
[<ffffff874f91a0ec>] rwsem_down_read_failed+0xe8
[<ffffff874e913dc4>] __percpu_down_read+0xfc
[<ffffff874e8a4ab0>] copy_process.isra.72.part.73+0xf60
[<ffffff874e8a53b8>] _do_fork+0xc4
[<ffffff874e8a5720>] kernel_thread+0x34
[<ffffff874e8ca83c>] kthreadd+0x144

kthreadd is waiting for cgroup_threadgroup_rwsem acquired
by task T

Task: T
[<ffffff874f91645c>] schedule+0x38
[<ffffff874f916870>] schedule_preempt_disabled+0x20
[<ffffff874f918df8>] __mutex_lock_slowpath+0x158
[<ffffff874f919004>] mutex_lock+0x14
[<ffffff874e962ff4>] cpuset_can_attach+0x58
[<ffffff874e95d640>] cgroup_taskset_migrate+0x8c
[<ffffff874e95d9b4>] cgroup_migrate+0xa4
[<ffffff874e95daf0>] cgroup_attach_task+0x100
[<ffffff874e95df28>] __cgroup_procs_write.isra.35+0x228
[<ffffff874e95e00c>] cgroup_tasks_write+0x10
[<ffffff874e958294>] cgroup_file_write+0x44
[<ffffff874eaa4384>] kernfs_fop_write+0xc0

task T is waiting for cpuset_mutex acquired
by kworker/2:1

sh ==> cpuhp/2 ==> kworker/2:1 ==> sh

kworker/2:3 ==> kthreadd ==> Task T ==> kworker/2:1

It seems that my earlier patch set should fix this scenario:
1) Inverting locking order of cpuset_mutex and cpu_hotplug_lock.
2) Make cpuset hotplug work synchronous.

Could you please share your feedback.

Thanks

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project
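
For readers who want to check the two "==>" chains above mechanically, here is a minimal user-space sketch. It is not kernel code and not part of the patch. It records each "X ==> Y" edge from the summary as "X is blocked on Y" and follows the edges until a task repeats. The names WAITS_FOR, find_cycle and blocked_forever are hypothetical, chosen only for this illustration, and the model assumes each task waits on exactly one other task, as the summary does.

```python
# Wait-for edges copied from the "==>" summary in the mail:
# each key is a task, each value is the task it is blocked on.
WAITS_FOR = {
    "sh": "cpuhp/2",
    "cpuhp/2": "kworker/2:1",
    "kworker/2:1": "sh",
    "kworker/2:3": "kthreadd",
    "kthreadd": "T",
    "T": "kworker/2:1",
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start.

    Returns the list of tasks forming the cycle that is reached,
    or None if the walk ends at a task that is not blocked.
    """
    path = []    # tasks in the order they were visited
    index = {}   # task -> position in path
    task = start
    while task is not None:
        if task in index:
            # Seen before: everything from its first visit onward is the cycle.
            return path[index[task]:]
        index[task] = len(path)
        path.append(task)
        task = waits_for.get(task)
    return None


def blocked_forever(waits_for):
    """Tasks that are on a cycle or wait (transitively) on one."""
    return sorted(t for t in waits_for if find_cycle(waits_for, t) is not None)


if __name__ == "__main__":
    for name in WAITS_FOR:
        cycle = find_cycle(WAITS_FOR, name)
        if cycle is None:
            print(name, "can make progress")
        else:
            print(name, "is stuck behind:", " ==> ".join(cycle + cycle[:1]))
```

Under these assumptions, sh, cpuhp/2 and kworker/2:1 form the cycle, and kworker/2:3, kthreadd and T all end up waiting behind it, which matches the two chains in the report.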