Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining
From: Qais Yousef <hidden>
Date: 2021-02-16 18:30:26
Also in:
lkml
On 02/12/21 00:30, Alexey Klimov wrote:
When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the "online" uevent,
userspace executes sched_setaffinity() on some task, trying to move it
to the recently onlined CPU, the call often fails with -EINVAL. Userspace needs
to wait around 5..30 ms after receiving the uevent before sched_setaffinity()
succeeds for the recently onlined CPU.
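
For illustration only, the userspace side of this race can be sketched as a retry loop around sched_setaffinity(). This is not part of the patch: the helper name pin_to_cpu_retry() and its max_tries/delay_ms parameters are hypothetical, and the sketch assumes glibc, where sched_setaffinity() and the CPU_* macros need _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): try to pin `pid` to `cpu`,
 * retrying while the kernel answers EINVAL, which is what a stale
 * cpuset mask produces right after the "online" uevent.
 * Returns 0 on success, -1 with errno set on failure.
 */
int pin_to_cpu_retry(pid_t pid, int cpu, int max_tries, long delay_ms)
{
	cpu_set_t set;
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};

	/* Reject CPU numbers that do not fit in a cpu_set_t. */
	if (cpu < 0 || cpu >= CPU_SETSIZE) {
		errno = EINVAL;
		return -1;
	}

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	for (int i = 0; i < max_tries; i++) {
		if (sched_setaffinity(pid, sizeof(set), &set) == 0)
			return 0;
		if (errno != EINVAL)
			return -1;	/* a different error: do not retry */
		nanosleep(&delay, NULL);
	}

	errno = EINVAL;
	return -1;
}
```

A caller reacting to the "online" uevent for CPU 3 might use pin_to_cpu_retry(0, 3, 10, 5), i.e. up to ten attempts 5 ms apart. Needing such a loop at all is the problem the patch is meant to remove.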
If the in_mask argument to sched_setaffinity() contains only the recently
onlined CPU, the call often fails along this path:
  sched_setaffinity()
      cpuset_cpus_allowed()
          guarantee_online_cpus()   <-- cs->effective_cpus mask does not
                                        contain recently onlined cpu
      cpumask_and()                 <-- final new_mask is empty
      __set_cpus_allowed_ptr()
          cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
          returns -EINVAL
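
The flow above can be modelled in plain userspace C to show why an empty intersection ends in -EINVAL. This is a simplified sketch, not kernel code: the model_* names and MODEL_NR_CPU_IDS are made up for illustration, masks are 64-bit integers instead of struct cpumask, and the CPU pick is a plain first-set-bit search rather than the distributing variant used by the kernel.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_NR_CPU_IDS 64	/* assumption: the model covers at most 64 CPUs */

/* Model of "any CPU set in both masks": returns MODEL_NR_CPU_IDS if none. */
unsigned int model_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPU_IDS;
}

/*
 * Model of the sched_setaffinity() path described above.
 *   in_mask        - mask requested by userspace
 *   effective_cpus - cpuset mask, possibly stale after a hotplug event
 *   active         - CPUs the scheduler can currently place tasks on
 */
int model_setaffinity(uint64_t in_mask, uint64_t effective_cpus,
		      uint64_t active)
{
	uint64_t cpus_allowed = effective_cpus;		/* cpuset_cpus_allowed() */
	uint64_t new_mask = in_mask & cpus_allowed;	/* cpumask_and() */

	/* __set_cpus_allowed_ptr(): no usable destination CPU means -EINVAL */
	if (model_any_and(active, new_mask) >= MODEL_NR_CPU_IDS)
		return -EINVAL;
	return 0;
}
```

With CPU 3 just onlined, model_setaffinity(0x8, 0x7, 0xF) models the stale case (effective_cpus still lacks CPU 3) and yields -EINVAL, while model_setaffinity(0x8, 0xF, 0xF) models the state after the cpuset update has run and yields 0.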
Cpusets used in guarantee_online_cpus() are updated via a workqueue from
cpuset_update_active_cpus(), which in turn is called from the cpu hotplug callback
sched_cpu_activate(), hence the update may not be observable by sched_setaffinity() if
it is called immediately after uevent.

nit: newline

Out of line uevent can be avoided if we will ensure that cpuset_hotplug_work
has run to completion using cpuset_wait_for_hotplug() after onlining the
cpu in cpu_device_up() and in cpuhp_smt_enable().

Co-analyzed-by: Joshua Baker [off-list ref]
Signed-off-by: Alexey Klimov <redacted>
---
This looks good to me.

Reviewed-by: Qais Yousef <redacted>

Thanks