Thread: 21 messages, 3 authors, 2015-07-08

[PATCH 2/9] locking/qrwlock: avoid redundant atomic_add_return on read_lock_slowpath

From: peterz@infradead.org (Peter Zijlstra)
Date: 2015-07-07 21:30:01
Also in: linux-arch

On Tue, Jul 07, 2015 at 01:51:54PM -0400, Waiman Long wrote:
> > -	cnts = atomic_add_return(_QR_BIAS,&lock->cnts) - _QR_BIAS;
> > +	atomic_add(_QR_BIAS,&lock->cnts);
> > +	cnts = smp_load_acquire((u32 *)&lock->cnts);
> >  	rspin_until_writer_unlock(lock, cnts);
> >
> >  	/*
>
> Atomic add in x86 is actually a full barrier too. The performance difference
> between "lock add" and "lock xadd" should be minor. The additional load,
> however, could potentially cause an additional cacheline load on a contended
> lock. So do you see actual performance benefit of this change in ARM?

Yes, atomic_add() does not imply (and does not have) any memory barriers
on ARM.
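
For readers outside the kernel, the two forms in the hunk above can be
approximated with C11 atomics. This is only a userspace sketch, not the
kernel code: the function names and the QR_BIAS value are made up for
illustration, and memory_order_seq_cst / memory_order_relaxed merely stand
in for the kernel's fully ordered atomic_add_return() and unordered
atomic_add().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative reader increment; assumed value, not taken from the patch. */
#define QR_BIAS 0x100u

/*
 * Old form: a single fully ordered read-modify-write.
 * fetch_add returns the value before the add, which matches
 * atomic_add_return(_QR_BIAS, ...) - _QR_BIAS in the hunk.
 */
static uint32_t reader_add_return(_Atomic uint32_t *cnts)
{
	return atomic_fetch_add_explicit(cnts, QR_BIAS, memory_order_seq_cst);
}

/*
 * New form: an unordered add followed by a separate acquire load.
 * The loaded value already includes this reader's own QR_BIAS, and it
 * may also reflect updates made by other CPUs between the two steps.
 */
static uint32_t reader_add_then_load(_Atomic uint32_t *cnts)
{
	atomic_fetch_add_explicit(cnts, QR_BIAS, memory_order_relaxed);
	return atomic_load_explicit(cnts, memory_order_acquire);
}
```

The second form trades one ordered read-modify-write for an unordered one
plus a load, which is the extra access to the lock word that the question
above is about.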