Re: [PATCH RT] sched: migrate_enable: Busy loop until the migration request is completed
From: Scott Wood <hidden>
Date: 2020-01-22 21:13:35
On Fri, 2019-12-13 at 09:14 +0100, Sebastian Andrzej Siewior wrote:
> On 2019-12-13 00:44:22 [-0600], Scott Wood wrote:
> > > @@ -8239,7 +8239,10 @@ void migrate_enable(void)
> > >  		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
> > >  				    &arg, &work);
> > >  		__schedule(true);
> > > -		WARN_ON_ONCE(!arg.done && !work.disabled);
> > > +		if (!work.disabled) {
> > > +			while (!arg.done)
> > > +				cpu_relax();
> > > +		}
> >
> > We should enable preemption while spinning -- besides the general
> > badness of spinning with it disabled, there could be deadlock
> > scenarios if multiple CPUs are spinning in such a loop. Long term
> > maybe have a way to dequeue the no-longer-needed work instead of
> > waiting.
>
> Hmm. My plan was to use per-CPU memory and spin before the request is
> enqueued if the previous isn't done yet (which should not happen™).

Either it can't happen (and thus no need to spin) or it can, and we need to worry about deadlocks if we're spinning with preemption disabled. In fact a deadlock is guaranteed if we're spinning with preemption disabled on the cpu that's supposed to be running the stopper we're waiting on. I think you're right that it can't happen though (as long as we queue it before enabling preemption, the stopper will be runnable and nothing else can run on the cpu before the queue gets drained), so we can just make it a warning. I'm testing a patch now.
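
To make the same-CPU case concrete, here is a rough user-space toy model (not kernel code, and not the patch being tested). It assumes a single CPU on which the stopper can only run once the spinning task becomes preemptible. Every name in it (toy_cpu, toy_tick, toy_spin_until_done) is hypothetical and exists only for this illustration; the `done` flag merely stands in for arg.done.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU. Hypothetical names, not kernel APIs. */
struct toy_cpu {
	bool preempt_disabled;	/* the spinning task keeps preemption off */
	bool stopper_queued;	/* stopper work is queued on this same CPU */
	bool done;		/* stands in for arg.done */
};

/*
 * One scheduling opportunity. Assumption of this model: the stopper
 * only gets the CPU if the currently spinning task can be preempted.
 */
static void toy_tick(struct toy_cpu *cpu)
{
	if (cpu->stopper_queued && !cpu->preempt_disabled) {
		cpu->stopper_queued = false;
		cpu->done = true;	/* stopper ran and completed the request */
	}
	/* Otherwise the spinner keeps the CPU and nothing changes. */
}

/*
 * Bounded version of "while (!arg.done) cpu_relax();".
 * Returns true if the request completed within max_ticks,
 * false if the spinner would still be waiting.
 */
static bool toy_spin_until_done(struct toy_cpu *cpu, int max_ticks)
{
	assert(max_ticks > 0);

	for (int i = 0; i < max_ticks && !cpu->done; i++)
		toy_tick(cpu);

	return cpu->done;
}
```

In this model, a toy_cpu with preempt_disabled and stopper_queued both set never reaches done no matter how large max_ticks is, while the same state with preempt_disabled cleared completes on the first tick. It only covers the single-CPU case described above, not the multi-CPU scenarios.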

> Then we could remove __schedule() here and rely on preempt_enable()
> doing that.

We could do that regardless.

-Scott