Re: [RFC][PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu online
From: Alexey Klimov <hidden>
Date: 2021-01-19 18:28:43
Also in:
lkml
On Fri, Jan 15, 2021 at 6:54 AM Daniel Jordan [off-list ref] wrote:
Daniel Jordan [off-list ref] writes:quoted
Peter Zijlstra [off-list ref] writes:quoted
quoted
The nature of this bug is also described here (with different consequences): https://lore.kernel.org/lkml/20200211141554.24181-1-qais.yousef-5wv7dgnIgG8@public.gmane.org/Yeah, pesky deadlocks.. someone was going to try again.I dug up the synchronous patch https://lore.kernel.org/lkml/1579878449-10164-1-git-send-email-prsood-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org/ but surprisingly wasn't able to reproduce the lockdep splat from https://lore.kernel.org/lkml/F0388D99-84D7-453B-9B6B-EEFF0E7BE4CC-J5quhbR+WMc@public.gmane.org/ even though I could hit it a few weeks ago.oh okay, you need to mount a legacy cpuset hierarchy. So as the above splat shows, making cpuset_hotplug_workfn() synchronous means cpu_hotplug_lock (and "cpuhp_state-down") can be acquired before cgroup_mutex. But there are at least four cgroup paths that take the locks in the opposite order. They're all the same, they take cgroup_mutex and then cpu_hotplug_lock later on to modify one or more static keys. cpu_hotplug_lock should probably be ahead of cgroup_mutex because the latter is taken in a hotplug callback, and we should keep the static branches in cgroup, so the only way out I can think of is moving cpu_hotplug_lock to just before cgroup_mutex is taken and switching to _cpuslocked flavors of the static key calls. lockdep quiets down with that change everywhere, but it puts another big lock around a lot of cgroup paths. Seems less heavyhanded to go with this RFC. What do you all think?
Daniel, thank you for taking a look. I don't mind reviewing+testing another approach that you described.
Absent further discussion, Alexey, do you plan to post another version?
I plan to update this patch and re-send in the next couple of days. It looks like it might be a series of two patches. Sorry for delays. Best regards, Alexey