Re: [PATCH v2 3/6] sched: make double-lock-balance fair

[PATCH 0/5] sched: misc rt fixes for tip/sched/devel · Gregory Haskins <hidden> · 2008-08-25
[PATCH 1/5] sched: only try to push a task on wakeup if it is migratable · Gregory Haskins <hidden> · 2008-08-25
[PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Gregory Haskins <hidden> · 2008-08-25
Re: [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Nick Piggin <hidden> · 2008-08-26
Re: [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Gregory Haskins <hidden> · 2008-08-26
Re: [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Gregory Haskins <hidden> · 2008-08-27
Re: [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section · Nick Piggin <hidden> · 2008-08-27
[PATCH 3/5] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-25
Re: [PATCH 3/5] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-26
Re: [PATCH 3/5] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-26
Re: [PATCH 3/5] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH 3/5] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
Re: [PATCH 3/5] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH 3/5] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
[PATCH 4/5] sched: add sched_class->needs_post_schedule() member · Gregory Haskins <hidden> · 2008-08-25
[PATCH 5/5] sched: create "pushable_tasks" list to limit pushing to one attempt · Gregory Haskins <hidden> · 2008-08-25
[PATCH v2 0/6] Series short description · Gregory Haskins <hidden> · 2008-08-26
[PATCH v2 1/6] sched: only try to push a task on wakeup if it is migratable · Gregory Haskins <hidden> · 2008-08-26
[PATCH v2 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section · Gregory Haskins <hidden> · 2008-08-26
[PATCH v2 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-26
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Peter Zijlstra <peterz@infradead.org> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Peter Zijlstra <peterz@infradead.org> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Peter Zijlstra <peterz@infradead.org> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Nick Piggin <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Peter Zijlstra <peterz@infradead.org> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Russell King <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Ralf Baechle <hidden> · 2008-08-29
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
Re: [PATCH v2 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-08-27
[PATCH v2 4/6] sched: add sched_class->needs_post_schedule() member · Gregory Haskins <hidden> · 2008-08-26
[PATCH v2 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled · Gregory Haskins <hidden> · 2008-08-26
[PATCH v2 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt · Gregory Haskins <hidden> · 2008-08-26
Re: [PATCH v2 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt · Gregory Haskins <hidden> · 2008-08-29
Re: [PATCH v2 0/6] sched: misc rt fixes for tip/sched/devel (was: Series short description) · Gregory Haskins <hidden> · 2008-08-26
Re: [PATCH v2 0/6] Series short description · Peter Zijlstra <peterz@infradead.org> · 2008-08-27
[TIP/SCHED/DEVEL PATCH v3 0/6] sched: misc rt fixes · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 1/6] sched: only try to push a task on wakeup if it is migratable · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 3/6] sched: make double-lock-balance fair · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 4/6] sched: add sched_class->needs_post_schedule() member · Gregory Haskins <hidden> · 2008-09-04
Re: [TIP/SCHED/DEVEL PATCH v3 4/6] sched: add sched_class->needs_post_schedule() member · Steven Rostedt <rostedt@goodmis.org> · 2008-09-04
Re: [TIP/SCHED/DEVEL PATCH v3 4/6] sched: add sched_class->needs_post_schedule() member · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled · Gregory Haskins <hidden> · 2008-09-04
[TIP/SCHED/DEVEL PATCH v3 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt · Gregory Haskins <hidden> · 2008-09-04
Re: [TIP/SCHED/DEVEL PATCH v3 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt · Steven Rostedt <rostedt@goodmis.org> · 2008-09-04
Re: [TIP/SCHED/DEVEL PATCH v3 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt · Gregory Haskins <hidden> · 2008-09-04

From: Nick Piggin <hidden>
Date: 2008-08-27 10:27:00
Also in: lkml

On Wed, Aug 27, 2008 at 10:21:35AM +0200, Peter Zijlstra wrote:

On Tue, 2008-08-26 at 13:35 -0400, Gregory Haskins wrote:

double_lock_balance() currently favors logically lower cpus since they
often do not have to release their own lock to acquire a second lock.
The result is that logically higher cpus can get starved when there is
a lot of pressure on the RQs.  This can result in higher latencies on
higher cpu-ids.

This patch makes the algorithm more fair by forcing all paths to have
to release both locks before acquiring them again.  Since callsites to
double_lock_balance already consider it a potential preemption/reschedule
point, they have the proper logic to recheck for atomicity violations.

Signed-off-by: Gregory Haskins <redacted>
---

 kernel/sched.c |   52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index df6b447..850b454 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c

@@ -2782,21 +2782,43 @@ static void double_rq_unlock(struct rq *rq1, struct rq *rq2)
 		__release(rq2->lock);
 }
 
+#ifdef CONFIG_PREEMPT
+
 /*
- * double_lock_balance - lock the busiest runqueue, this_rq is locked already.
+ * fair double_lock_balance: Safely acquires both rq->locks in a fair
+ * way at the expense of forcing extra atomic operations in all
+ * invocations.  This assures that the double_lock is acquired using the
+ * same underlying policy as the spinlock_t on this architecture, which
+ * reduces latency compared to the unfair variant below.  However, it
+ * also adds more overhead and therefore may reduce throughput.
  */
-static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
+static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
+	__releases(this_rq->lock)
+	__acquires(busiest->lock)
+	__acquires(this_rq->lock)
+{
+	spin_unlock(&this_rq->lock);
+	double_rq_lock(this_rq, busiest);
+
+	return 1;
+}

Right - so to belabour Nick's point:

  if (!spin_trylock(&busiest->lock)) {
    spin_unlock(&this_rq->lock);
    double_rq_lock(this_rq, busiest);
  }

might unfairly treat someone who is waiting on this_rq if I understand
it right?
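
To make the concern concrete, here is a minimal userspace sketch (not kernel code) of the two variants being compared. It assumes POSIX mutexes as a stand-in for rq->lock, reduces struct rq to just its lock, and uses made-up `_unfair`/`_fair` suffixes to tell the variants apart. The address-ordered double_rq_lock() is likewise only an approximation of the kernel helper of the same name.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct rq; only the lock matters here. */
struct rq {
	pthread_mutex_t lock;
};

/* Take both locks in address order so two callers cannot deadlock each other. */
static void double_rq_lock(struct rq *rq1, struct rq *rq2)
{
	if (rq1 == rq2) {
		pthread_mutex_lock(&rq1->lock);
	} else if ((uintptr_t)rq1 < (uintptr_t)rq2) {
		pthread_mutex_lock(&rq1->lock);
		pthread_mutex_lock(&rq2->lock);
	} else {
		pthread_mutex_lock(&rq2->lock);
		pthread_mutex_lock(&rq1->lock);
	}
}

/*
 * Trylock variant: called with this_rq->lock held.  If busiest->lock is
 * free we keep this_rq->lock the whole time, so a waiter queued on
 * this_rq gets no chance to run.  Returns 1 if this_rq->lock was dropped.
 */
static int double_lock_balance_unfair(struct rq *this_rq, struct rq *busiest)
{
	if (pthread_mutex_trylock(&busiest->lock) == 0)
		return 0;

	pthread_mutex_unlock(&this_rq->lock);
	double_rq_lock(this_rq, busiest);
	return 1;
}

/*
 * Fair variant from the patch: always drop this_rq->lock and requeue
 * for both locks, at the cost of extra atomic operations on every call.
 */
static int double_lock_balance_fair(struct rq *this_rq, struct rq *busiest)
{
	pthread_mutex_unlock(&this_rq->lock);
	double_rq_lock(this_rq, busiest);
	return 1;
}
```

Both functions return with both locks held. The return value tells the caller whether this_rq->lock was released along the way, which is when the call site has to recheck its state.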

I suppose one could then write it like:

  if (spin_is_contended(&this_rq->lock) || !spin_trylock(&busiest->lock)) {
    spin_unlock(&this_rq->lock);
    double_rq_lock(this_rq, busiest);
  }

But, I'm not sure that's worth the effort at that point..
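
For illustration only, the contention check can be sketched in userspace on top of a toy ticket lock built with C11 atomics. Everything here is an assumption made for the example: the `ticket_*` names, the reduced struct rq, and the `_contended` suffix are hypothetical, and the "more than one ticket outstanding" test is just one plausible reading of what spin_is_contended() reports on a ticket-lock architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy ticket lock: FIFO by construction, since tickets are served in order. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently being served */
};

static void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

static void ticket_lock(struct ticket_lock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->owner) != me)
		;	/* spin until our ticket is served */
}

/* Succeeds only when nobody holds or waits for the lock (next == owner). */
static bool ticket_trylock(struct ticket_lock *l)
{
	unsigned int expected = atomic_load(&l->owner);

	return atomic_compare_exchange_strong(&l->next, &expected, expected + 1);
}

static void ticket_unlock(struct ticket_lock *l)
{
	atomic_fetch_add(&l->owner, 1);
}

/* Call with the lock held: more than one outstanding ticket means a waiter. */
static bool ticket_is_contended(struct ticket_lock *l)
{
	unsigned int next = atomic_load(&l->next);
	unsigned int owner = atomic_load(&l->owner);

	return next - owner > 1;
}

/* Hypothetical stand-in for struct rq. */
struct rq {
	struct ticket_lock lock;
};

/* Address-ordered acquisition of two distinct locks, to avoid deadlock. */
static void double_rq_lock(struct rq *rq1, struct rq *rq2)
{
	if ((uintptr_t)rq1 < (uintptr_t)rq2) {
		ticket_lock(&rq1->lock);
		ticket_lock(&rq2->lock);
	} else {
		ticket_lock(&rq2->lock);
		ticket_lock(&rq1->lock);
	}
}

/*
 * Keep this_rq->lock only if nobody is waiting on it and busiest->lock
 * is free; otherwise drop it and requeue.  Returns 1 if it was dropped.
 */
static int double_lock_balance_contended(struct rq *this_rq, struct rq *busiest)
{
	if (ticket_is_contended(&this_rq->lock) ||
	    !ticket_trylock(&busiest->lock)) {
		ticket_unlock(&this_rq->lock);
		double_rq_lock(this_rq, busiest);
		return 1;
	}
	return 0;
}
```

Note that even the fast path reads this_rq->lock and then writes busiest->lock, so it touches two lock words before it can return.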

Yeah, that could work, but hmm, it might cause two cache-coherency
transactions even in the fast path, so it might end up slower than just
unlocking unconditionally and taking both locks :(

Anyway - I think all this is utterly defeated on CONFIG_PREEMPT by the
spin-with-IRQs-enabled logic in kernel/spinlock.c.

Making this an -rt only patch...

Hmm, and also on x86 with ticket locks we don't spin with preemption or
interrupts enabled any more (although of course we still do on other
architectures).
