Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation

[PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Waiman Long <hidden> · 2014-01-28
[PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-28
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Andi Kleen <hidden> · 2014-01-29
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · George Spelvin <hidden> · 2014-01-29
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-29
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Rik van Riel <hidden> · 2014-01-30
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Tim Chen <hidden> · 2014-01-30
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-01-30
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Tim Chen <hidden> · 2014-01-30
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · George Spelvin <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-01-30
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-01-31
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Peter Zijlstra <peterz@infradead.org> · 2014-02-03
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Waiman Long <hidden> · 2014-02-06
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Paul E. McKenney <hidden> · 2014-02-07
Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation · Raghavendra K T <hidden> · 2014-02-03
[PATCH v3 2/2] qspinlock, x86: Enable x86-64 to use queue spinlock · Waiman Long <hidden> · 2014-01-28
Re: [PATCH v3 2/2] qspinlock, x86: Enable x86-64 to use queue spinlock · Rik van Riel <hidden> · 2014-01-30
Re: [PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Raghavendra K T <hidden> · 2014-01-30
Re: [PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Waiman Long <hidden> · 2014-01-30
Re: [PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Raghavendra K T <hidden> · 2014-01-30
Re: [PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Raghavendra K T <hidden> · 2014-02-03
Re: [PATCH v3 0/2] qspinlock: Introducing a 4-byte queue spinlock · Waiman Long <hidden> · 2014-02-06

From: Peter Zijlstra <peterz@infradead.org>
Date: 2014-01-31 19:46:21
Also in: lkml

On Fri, Jan 31, 2014 at 01:26:29PM -0500, Waiman Long wrote:

quoted

I don't get why we need the used thing at all; something like:

struct qna {
int cnt;
struct qnode nodes[4];
};

DEFINE_PER_CPU(struct qna, qna);

struct qnode *get_qnode(void)
{
struct qna *qna = this_cpu_ptr(&qna);

return qna->nodes[qna->cnt++]; /* RMW */
}

void put_qnode(struct qnode *qnode)
{
struct qna *qna = this_cpu_ptr(&qna);
qna->cnt--;
}

Should do fine, right?

Yes, we can do something like that. However I think put_qnode() needs to use
atomic dec as well. As a result, we will need 2 additional atomic operations
per slowpath invocation. The code may look simpler, but I don't think it
will be faster than what I am currently doing as the cases where the used
flag is set will be relatively rare.

No, put doesn't need an atomic; nor is it as well; because the inc
doesn't need an atomic either.

quoted

If we interrupt the RMW above the interrupted context hasn't yet used
the queue and once we return its free again, so all should be well even
on load-store archs.

The nodes array might as well be 3, because NMIs should never contend on
a spinlock, so all we're left with is task, softirq and hardirq context.

I am not so sure about NMI not taking a spinlock. I seem to remember seeing
code that did that. Actually, I think the NMI code is trying to printk
something which, in turn, need to acquire a spinlock.

Yeah I know, terribly broken that, I've been waiting for that to explode
:-)

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help