Re: [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
From: Sowmini Varadhan <hidden>
Date: 2015-03-30 13:01:32
Also in:
sparclinux
From: Sowmini Varadhan <hidden>
Date: 2015-03-30 13:01:32
Also in:
sparclinux
On (03/30/15 21:55), Benjamin Herrenschmidt wrote:
No that's not my point. The lock is only taken for a short time but might still collide, the bouncing in that case will probably (at least that's my feeling) hurt more than help. However, I have another concern with your construct. Essentially you spin looking for an unlocked pool without a cpu_relax. Now it's unlikely but you *can* end up eating cycles, which on a high SMT like POWER8 might mean slowing down the actual owner of the pool lock.
So I tried looking at the code, and perhaps there is some arch-specific subtlety here that I am missing, but where does spin_lock itself do the cpu_relax? afaict, LOCK_CONTENDED() itself does not have this.
What is iperf ? What does that mean "there was lock contention" ? IE, was the overall performance improved or not ? Removing contention by trading it for cache line bouncing will not necessarily help. I'm not saying this is a bad optimisation but it makes the code messy and I think deserves some solid numbers demonstrating its worth.
iperf is a network perf benchmark. I'll try to regenerate some numbers today and get some instrumentation of cache-line bounces vs trylock misses.
There was also an actual bug in the case where you hop'ed to a new pool and forgot the flush.
yes, thanks for catching that, I'll fix that too, of course. --Sowmini