Re: [PATCH v3] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining
From: Daniel Jordan <hidden>
Date: 2021-03-18 19:29:26
Also in:
lkml
Alexey Klimov [off-list ref] writes:
When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the "online" uevent,
userspace executes sched_setaffinity() on some task, trying to move it
to the recently onlined CPU, the call sometimes fails with -EINVAL. Userspace
needs to wait around 5..30 ms after receiving the uevent before
sched_setaffinity() will succeed for the recently onlined CPU.
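
For illustration only, the userspace side of this race might look like the
sketch below: a caller that retries sched_setaffinity() on EINVAL for a short
period after the "online" uevent. The helper name pin_to_cpu_retry() and its
retry parameters are hypothetical and not part of the patch; the patch aims to
make such retries unnecessary.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): try to pin `pid` to a single
 * CPU, retrying on EINVAL up to `attempts` times with `delay_ms` between
 * tries. Returns 0 on success, -1 with errno set on failure.
 */
int pin_to_cpu_retry(pid_t pid, int cpu, int attempts, long delay_ms)
{
	cpu_set_t mask;
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);

	for (int i = 0; i < attempts; i++) {
		if (sched_setaffinity(pid, sizeof(mask), &mask) == 0)
			return 0;
		/* Only EINVAL is treated as "cpuset not updated yet". */
		if (errno != EINVAL)
			return -1;
		nanosleep(&delay, NULL);
	}

	errno = EINVAL;
	return -1;
}
```

With the 5..30 ms window reported above, a call such as
pin_to_cpu_retry(pid, cpu, 10, 5) would presumably cover the delay, though
those numbers are only a guess.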
If the in_mask argument for sched_setaffinity() contains only the recently
onlined CPU, it can fail with the following flow:
  sched_setaffinity()
    cpuset_cpus_allowed()
      guarantee_online_cpus()      <-- cs->effective_cpus mask does not
                                       contain recently onlined cpu
    cpumask_and()                  <-- final new_mask is empty
    __set_cpus_allowed_ptr()
      cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
      returns -EINVAL
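
The flow above can be approximated by a toy userspace model, shown here only
to make the mask arithmetic concrete. It is not kernel code: the names
model_any_and() and model_setaffinity(), the 8-CPU NR_CPU_IDS, and the use of
plain 64-bit bit fields in place of struct cpumask are all assumptions made
for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define NR_CPU_IDS 8	/* assumed small system for the model */

/* Lowest CPU set in both masks, or NR_CPU_IDS if the intersection is empty. */
int model_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Model of the failing path: the requested mask is intersected with the
 * cpuset's effective CPUs, then a destination CPU is picked among the
 * active CPUs. An empty intersection yields -EINVAL.
 */
int model_setaffinity(uint64_t in_mask, uint64_t effective_cpus,
		      uint64_t active_mask)
{
	uint64_t new_mask = in_mask & effective_cpus;	/* cpumask_and() step */
	int dest_cpu = model_any_and(active_mask, new_mask);

	if (dest_cpu >= NR_CPU_IDS)
		return -EINVAL;
	return 0;
}
```

In this model, CPU 3 coming online with a stale effective mask corresponds to
model_setaffinity(0x8, 0x7, 0xf), which returns -EINVAL; once the effective
mask catches up (0xf), the same request returns 0.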
The cpusets used in guarantee_online_cpus() are updated from a workqueue
scheduled by cpuset_update_active_cpus(), which in turn is called from the CPU
hotplug callback sched_cpu_activate(). The update therefore may not yet be
visible to sched_setaffinity() if it is called immediately after the uevent.
The out-of-line uevent can be avoided by ensuring that cpuset_hotplug_work
has run to completion, using cpuset_wait_for_hotplug(), after onlining the
CPU in cpu_device_up() and in cpuhp_smt_enable().
Cc: Daniel Jordan <redacted>
Reviewed-by: Qais Yousef <redacted>
Co-analyzed-by: Joshua Baker [off-list ref]
Signed-off-by: Alexey Klimov <redacted>

Looks good to me.

Reviewed-by: Daniel Jordan <redacted>