Thread: 14 messages, 2 authors, 2015-02-26

Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

From: Peter Zijlstra <peterz@infradead.org>
Date: 2015-02-26 07:49:25
Also in: lkml

On Wed, Feb 25, 2015 at 12:50:15PM -0500, Steven Rostedt wrote:
> > Well, the problem with it is one of collisions. So the 'easy' solution I
> > proposed would be something like:
> >
> > int ips_next(struct ipi_pull_struct *ips)
> > {
> > 	int cpu = ips->src_cpu;
> > 	cpu = cpumask_next(cpu, rto_mask);
> > 	if (cpu >= nr_cpu_ids) {
>
> Do we really need to loop? Just start with the first one, and go to the
> end.
>
> > 		cpu = -1; /* wrap: cpumask_next(-1, ...) returns the first set cpu */
> > 		ips->flags |= IPS_LOOPED;
> > 		cpu = cpumask_next(cpu, rto_mask);
> > 		if (cpu >= nr_cpu_ids) /* empty mask */
> > 			return cpu;
> > 	}
> > 	if (ips->flags & IPS_LOOPED && cpu >= ips->stop_cpu)
> > 		return nr_cpu_ids;
> > 	return cpu;
> > }

Yes, notice that we don't start iterating at the beginning; this is on
purpose. If we start iterating at the beginning, _every_ cpu will again
pile up on the first one.

By starting at the current cpu, each cpu will start iteration some place
else and hopefully, with a big enough system, different CPUs end up on a
different rto cpu.
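
A minimal userspace sketch of that walk, not kernel code: it assumes a
64-bit word stands in for rto_mask, a fixed NR_CPUS of 8 stands in for
nr_cpu_ids, and the hypothetical helper mask_next() stands in for
cpumask_next(). The wrap-around restarts the search from -1 so that CPU 0
is considered too, which is an assumption of this sketch.

```c
#include <stdint.h>

#define NR_CPUS    8    /* assumed CPU count, stands in for nr_cpu_ids */
#define IPS_LOOPED 0x1  /* set once the walk has wrapped around */

struct ips_model {
	int src_cpu;        /* where the walk currently is */
	int stop_cpu;       /* where a wrapped walk must stop */
	unsigned int flags;
};

/* Hypothetical stand-in for cpumask_next(): next set bit strictly after cpu. */
static int mask_next(int cpu, uint64_t mask)
{
	for (cpu++; cpu < NR_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			return cpu;
	return NR_CPUS;
}

/* Model of ips_next(): walk forward from src_cpu, wrapping at most once. */
static int ips_next_model(struct ips_model *ips, uint64_t rto_mask)
{
	int cpu = mask_next(ips->src_cpu, rto_mask);

	if (cpu >= NR_CPUS) {
		ips->flags |= IPS_LOOPED;
		cpu = mask_next(-1, rto_mask);  /* restart before CPU 0 */
		if (cpu >= NR_CPUS)             /* empty mask */
			return cpu;
	}
	if ((ips->flags & IPS_LOOPED) && cpu >= ips->stop_cpu)
		return NR_CPUS;                 /* wrapped back to the start */
	return cpu;
}

/* First target picked by a CPU that starts the walk at its own position. */
static int first_target(int self, uint64_t rto_mask)
{
	struct ips_model ips = { .src_cpu = self, .stop_cpu = self, .flags = 0 };

	return ips_next_model(&ips, rto_mask);
}
```

With CPUs 1 and 5 overloaded (mask 0x22), a walk started on CPU 0 lands on
CPU 1 while a walk started on CPU 3 lands on CPU 5, so the two requesters
do not pile up on the same target.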

> > 	struct ipi_pull_struct *ips = __this_cpu_ptr(ips);
> >
> > 	raw_spin_lock(&ips->lock);
> > 	if (ips->flags & IPS_BUSY) {
> > 		/* there is an IPI active; update state */
> > 		ips->dst_prio = current->prio;
> > 		ips->stop_cpu = ips->src_cpu;
> > 		ips->flags &= ~IPS_LOOPED;
>
> I guess the loop is needed for continuing the work, in case the
> scheduling changed?

That too.

> > 	} else {
> > 		/* no IPI active, make one go */
> > 		ips->dst_cpu = smp_processor_id();
> > 		ips->dst_prio = current->prio;
> > 		ips->src_cpu = ips->dst_cpu;
> > 		ips->stop_cpu = ips->dst_cpu;
> > 		ips->flags = IPS_BUSY;
> >
> > 		cpu = ips_next(ips);
> > 		ips->src_cpu = cpu;
> > 		if (cpu < nr_cpu_ids)
> > 			irq_work_queue_on(&ips->work, cpu);
> > 	}
> > 	raw_spin_unlock(&ips->lock);
>
> I'll have to spend some time comprehending this.

:-)

> > Where you would simply start walking the RTO mask from the current
> > position -- it also includes some restart logic, and you'd only take
> > ips->lock when your ipi handler starts and when it needs to migrate to
> > another cpu.
> >
> > This way, on big systems, there's at least some chance different CPUs
> > find different targets to pull from.
>
> OK, makes sense. I can try that.
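
The restart logic in the request path can be modelled in userspace roughly
as follows. This is only a sketch under stated assumptions: locking and the
actual IPI queueing are left out, the flag values are made up, and
ips_request() is a hypothetical name for the quoted fragment, returning 1
when a new IPI chain would have to be started and 0 when the request is
merged into one already in flight.

```c
#define IPS_BUSY   0x1  /* assumed flag value: an IPI chain is in flight */
#define IPS_LOOPED 0x2  /* assumed flag value: the walk has wrapped once */

struct ips_req {
	int dst_cpu;        /* CPU that wants to pull */
	int dst_prio;       /* priority it is about to run at */
	int src_cpu;        /* current position of the walk */
	int stop_cpu;       /* position at which a wrapped walk ends */
	unsigned int flags;
};

/* Hypothetical model of the request path; the caller would hold ips->lock. */
static int ips_request(struct ips_req *ips, int this_cpu, int prio)
{
	if (ips->flags & IPS_BUSY) {
		/* An IPI is active: refresh the priority and restart the lap
		 * from wherever the walk currently is. */
		ips->dst_prio = prio;
		ips->stop_cpu = ips->src_cpu;
		ips->flags &= ~IPS_LOOPED;
		return 0;
	}

	/* No IPI active: set up a fresh walk starting at this CPU. */
	ips->dst_cpu = this_cpu;
	ips->dst_prio = prio;
	ips->src_cpu = this_cpu;
	ips->stop_cpu = this_cpu;
	ips->flags = IPS_BUSY;
	return 1;
}
```

A second request arriving while the walk is already on, say, CPU 5 does not
start another IPI; it moves stop_cpu to 5 and clears IPS_LOOPED, so the
in-flight walk gets one more full lap from its current position.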