[PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Sunil Kovvuri <hidden>
Date: 2017-03-16 13:18:57
Also in:
linux-iommu
On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy [off-list ref] wrote:
> Hi all, Here's the first bit of lock contention removal to chew on - feedback welcome! Note that for the current users of the io-pgtable framework, this is most likely to simply push more contention onto the io-pgtable lock, so may not show a great improvement alone. Will and I both have rough proof-of-concept implementations of lock-free io-pgtable code which we need to sit down and agree on at some point, hopefully fairly soon.
Thanks for working on this. As you said, it does push the lock contention down from the IOVA rbtree lock to the pgtable lock. But now, more than the lock, the issue I see is the CPU yielding while waiting for tlb_sync. Some numbers are below.

I tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. basically removed cpu_relax() and udelay() to make it a busy loop.

Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps

If we reduce pgtable contention a bit:
With your patches + busy loop in tlb_sync + iperf threads reduced to 8 from 16: ~9 Gbps

So it looks like, along with the pgtable lock, some optimization can be done to the tlb_sync code as well.

Thanks,
Sunil.
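
For readers following along, the shape of the tweak described above can be sketched in user-space C. This is not the SMMUv2 driver code: `struct fake_smmu`, `fake_read_status()`, `fake_backoff()`, `tlb_sync_poll()`, `SYNC_ACTIVE` and `LOOP_TIMEOUT` are all hypothetical names made up for illustration, and the "register" is simulated by a counter. It only shows the difference between a poll loop that backs off on every iteration (standing in for cpu_relax() + udelay()) and one that spins without any backoff.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYNC_ACTIVE  0x1u      /* hypothetical "sync in progress" status bit */
#define LOOP_TIMEOUT 1000000u  /* hypothetical cap on the number of polls */

/* Hypothetical stand-in for the hardware: the status read reports
 * "sync active" for a fixed number of reads, then reports idle. */
struct fake_smmu {
	unsigned int reads_until_idle; /* active reads left before idle */
	unsigned int delay_calls;      /* how many times the backoff ran */
};

static uint32_t fake_read_status(struct fake_smmu *s)
{
	if (s->reads_until_idle == 0)
		return 0;
	s->reads_until_idle--;
	return SYNC_ACTIVE;
}

/* Stands in for cpu_relax() + udelay(); here it only counts calls. */
static void fake_backoff(struct fake_smmu *s)
{
	s->delay_calls++;
}

/* Poll until the simulated sync completes or the poll cap is hit.
 * backoff == true : relax/delay between polls.
 * backoff == false: pure busy loop, as in the experiment above.
 * Returns true on completion, false on timeout. */
static bool tlb_sync_poll(struct fake_smmu *s, bool backoff)
{
	unsigned int count = 0;

	while (fake_read_status(s) & SYNC_ACTIVE) {
		if (++count == LOOP_TIMEOUT)
			return false;
		if (backoff)
			fake_backoff(s);
	}
	return true;
}
```

With backoff enabled, every poll that still sees the sync as active pays the delay cost. With it disabled, the loop returns as soon as the status clears, at the price of spinning on the CPU. The sketch says nothing about real hardware timings.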