* Waiman Long [off-list ref] wrote:
> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
> trylocks (read & write), the count is read first before attempting to
> lock it. We did the same for all trylock functions in other locks.
> Depending on how the trylock is used and how contended the lock is, it
> may help or hurt performance. Changing down_read_trylock to do an
> unconditional cmpxchg will change the performance profile of existing
> code. So I would prefer keeping the current code.
>
> I do notice now that the generic down_write_trylock() code is doing an
> unconditional cmpxchg. So I wonder if we should change it to read the
> lock first like other trylocks or just leave it as it is.

No, I think we should instead move the other trylocks to the
try-for-ownership model as well, like Linus suggested.
That's the general assumption we make in locking primitives, that we
optimize for the common, expected case - which would be that the trylock
succeeds, and I don't see why trylock primitives should be different.
In fact I can see more ways for read-for-sharing to perform suboptimally
on larger systems.
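
To make the two models concrete, here is a minimal userspace sketch using
C11 atomics. It is not the kernel's rwsem code: the names
trylock_read_first(), trylock_cmpxchg_first(), UNLOCKED and WRITER_LOCKED
are hypothetical, and the lock is reduced to a single count word with one
"write-locked" value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical count values, for illustration only. */
#define UNLOCKED      0L
#define WRITER_LOCKED 1L

/*
 * Read-first ("read-for-sharing") model: load the count and only
 * attempt the cmpxchg if the lock looks free.
 */
static bool trylock_read_first(atomic_long *count)
{
	long old = atomic_load_explicit(count, memory_order_relaxed);

	if (old != UNLOCKED)
		return false;	/* fail without writing to the lock word */

	return atomic_compare_exchange_strong_explicit(count, &old,
			WRITER_LOCKED,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Try-for-ownership model: issue the cmpxchg unconditionally,
 * assuming the common case is that the lock is free.
 */
static bool trylock_cmpxchg_first(atomic_long *count)
{
	long expected = UNLOCKED;

	return atomic_compare_exchange_strong_explicit(count, &expected,
			WRITER_LOCKED,
			memory_order_acquire, memory_order_relaxed);
}
```

In the success case the read-first variant typically brings the cache line
in shared state and then has to upgrade it to exclusive for the cmpxchg,
while the try-for-ownership variant requests the line for ownership once.
The read-first variant only comes out ahead when the trylock usually fails.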
Thanks,
Ingo