[PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Srivatsa S. Bhat <hidden>
Date: 2013-02-18 18:16:55
Also in:
linux-arch, linux-pm, linuxppc-dev, lkml, netdev
On 02/18/2013 11:37 PM, Michel Lespinasse wrote:
On Tue, Feb 19, 2013 at 1:56 AM, Srivatsa S. Bhat [off-list ref] wrote:quoted
On 02/18/2013 09:51 PM, Srivatsa S. Bhat wrote:quoted
On 02/18/2013 09:15 PM, Michel Lespinasse wrote:quoted
I don't see anything preventing a race with the corresponding code in percpu_write_unlock() that sets writer_signal back to false. Did I miss something here ? It seems to me we don't have any guarantee that all writer signals will be set to true at the end of the loop...Ah, thanks for pointing that out! IIRC Oleg had pointed this issue in the last version, but back then, I hadn't fully understood what he meant. Your explanation made it clear. I'll work on fixing this.We can fix this by using the simple patch (untested) shown below. The alternative would be to acquire the rwlock for write, update the ->writer_signal values, release the lock, wait for readers to switch, again acquire the rwlock for write with interrupts disabled etc... which makes it kinda messy, IMHO. So I prefer the simple version shown below.Looks good. Another alternative would be to make writer_signal an atomic integer instead of a bool. That way writers can increment it before locking and decrement it while unlocking.
Yep, that would also do. But the spinlock version looks simpler - no need to check if the atomic counter is non-zero, no need to explicitly spin in a tight-loop etc.
To reduce the number of atomic ops during writer lock/unlock, the writer_signal could also be a global read_mostly variable (I don't see any downsides to that compared to having it percpu - or is it because you wanted all the fastpath state to be in one single cacheline ?)
Yes, we (Oleg and I) debated for a while about global vs percpu, and then finally decided to go with percpu to have cache benefits. Regards, Srivatsa S. Bhat