[PATCH 0/3] IOVA allocation improvements for iommu-dma

From: Sunil Kovvuri <hidden>
Date: 2017-03-16 13:18:57
Also in: linux-iommu

On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy [off-list ref] wrote:

> Hi all,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.

Thanks for working on this.
As you said, it is indeed pushing lock contention down from the iova rbtree
lock to the pgtable lock, but now, more than the lock, the issue I see is the
CPU yielding while waiting for tlb_sync. Below are some numbers.

I have tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. basically
removed cpu_relax() and udelay() to make it a busy loop.

Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps

If we reduce pgtable contention a bit:
With your patches + busy loop in tlb_sync + iperf threads reduced from
16 to 8: ~9 Gbps

So it looks like, along with the pgtable lock, some optimization can be
done to the tlb_sync code as well.
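
As a rough illustration of why the back-off in the tlb_sync wait loop can
matter, here is a toy timing model in user-space C. It is not the SMMUv2
driver code: the function name, its parameters, and all timing values are
hypothetical, and time is only simulated. It just computes when a polling
loop would notice that a sync has finished, with and without a delay
between status reads.

```c
#include <assert.h>
#include <limits.h>

/*
 * Toy timing model, not driver code. All names and parameters are
 * hypothetical. Time is simulated, in nanoseconds.
 *
 * done_at_ns: when the simulated hardware finishes the sync
 * poll_ns:    cost of one status read
 * delay_ns:   back-off between reads (0 models the busy loop)
 * max_polls:  give up after this many reads
 *
 * Returns the simulated time at which completion is observed,
 * or ULONG_MAX if max_polls reads were not enough.
 */
static unsigned long observe_sync_done(unsigned long done_at_ns,
                                       unsigned long poll_ns,
                                       unsigned long delay_ns,
                                       unsigned long max_polls)
{
    unsigned long now = 0;

    assert(poll_ns > 0);

    for (unsigned long i = 0; i < max_polls; i++) {
        now += poll_ns;            /* one status read */
        if (now >= done_at_ns)     /* this read sees the sync as done */
            return now;
        now += delay_ns;           /* back off before the next read */
    }
    return ULONG_MAX;              /* timed out */
}
```

With made-up values of a 300 ns sync and a 50 ns status read, the model
reports completion at 300 ns for delay_ns = 0 and at 1100 ns for
delay_ns = 1000, i.e. a short sync can be dominated by the back-off itself.
Real hardware latencies will differ.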

Thanks,
Sunil.
