Re: On migrate_disable() and latencies

From: Peter Zijlstra <peterz@infradead.org>
Date: 2011-07-25 08:26:24
Also in: lkml

On Fri, 2011-07-22 at 17:39 -0700, Paul E. McKenney wrote:


Therefore the worst case latency is in the order of
max-migrate-disable-period * nr-cpus.

OK, but wouldn't that be the latency as seen by the lowest-priority
task?

It would indeed, the utility loss is added to the preemption cost of the
lower priority tasks.

Or are migrate-disable tasks given preferential treatment?
If not, a prio-99 task would get the same latency either way, right?

Right.

Migration-disable can magnify the latency seen by low-priority tasks, if
I understand correctly.  If you disabled preemption, when a low-priority
task became runnable, it would find an idle CPU.  But with migration
disable, the lowest-priority task might enter a migration-disable region,
then be preempted by a marginally higher-priority task that also enters
a migration-disable region, and is also preempted, and so on.  The
lowest-priority task cannot run on the current CPU because of all
the higher-priority tasks, and cannot migrate due to being in a
migration-disable section.

Exactly so.

In other words, as is often the case, better worst-case service to
the high-priority tasks can multiply the latency seen by the
low-priority tasks.

So is the topic to quantify this?

I suppose it is indeed. Even for the SoftRT case we need to make sure
the total utilization loss is indeed acceptable.

If so, my take is that the latency
to the highest-priority task decreases by an amount roughly equal to
the duration of the longest preempt_disable() region that turned into a
migration-disable region, while that to the lowest-priority task increases
by N-1 times the CPU overhead of the longest migration-disable region,
plus context switches.  (Yes, this is a very crude rule-of-thumb model.
A real model would have much higher mathematics and might use a more
detailed understanding of the workload.)

Or am I misunderstanding how all this works?

No, I think you're gettin' it.
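
For concreteness, the crude rule of thumb above can be written down as a
few lines of C. This is only a sketch of that back-of-the-envelope model,
not kernel code: the function names and parameters are hypothetical, N is
assumed to be the number of CPUs (equivalently, the depth of the stack of
tasks preempted inside migrate-disable regions), and "plus context
switches" is assumed to mean one switch per preemption in the chain.

```c
/*
 * Rule-of-thumb latency model from the discussion above.
 * Illustrative only; every name here is hypothetical and
 * all times are in microseconds.
 */

/*
 * Extra latency seen by the lowest-priority task: it may have to wait
 * behind (n - 1) higher-priority tasks, each running through the longest
 * migrate-disable region, with one context switch per preemption
 * (the per-preemption switch count is an assumption).
 */
unsigned long lowprio_latency_increase_us(unsigned int n,
					  unsigned long max_migrate_disable_us,
					  unsigned long ctx_switch_us)
{
	if (n < 2)
		return 0;	/* nobody to stack up behind */

	return (unsigned long)(n - 1) *
	       (max_migrate_disable_us + ctx_switch_us);
}

/*
 * Latency reduction seen by the highest-priority task: roughly the
 * longest preempt_disable() region that became a migrate-disable
 * region, since that region is now preemptible.
 */
unsigned long highprio_latency_decrease_us(unsigned long max_converted_region_us)
{
	return max_converted_region_us;
}
```

With made-up numbers, 4 CPUs, a 100us longest region and 5us per context
switch, the lowest-priority task would see about 3 * (100 + 5) = 315us of
added latency, while the highest-priority task gains about 100us.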

My main worry with all this is that we have these insanely long !preempt
regions in mainline that are now !migrate regions, and thus per all the
above we could be looking at a substantial utilization loss.

Alternatively we could all be missing something far more horrid, but
that might just be my paranoia talking.
