Thread (24 messages) 24 messages, 5 authors, 2023-09-27

Re: [PATCH v12 5/6] iommu/dma: Allow a single FQ in addition to per-CPU FQs

From: Niklas Schnelle <schnelle@linux.ibm.com>
Date: 2023-09-11 21:59:34
Also in: asahi, linux-arm-kernel, linux-arm-msm, linux-doc, linux-iommu, linux-mediatek, linux-s390, linux-sunxi, linux-tegra, lkml

On Fri, 2023-08-25 at 12:11 +0200, Niklas Schnelle wrote:
In some virtualized environments, including s390 paged memory guests,
IOTLB flushes are used to update IOMMU shadow tables. Due to this, they
are much more expensive than in typical bare metal environments or
non-paged s390 guests. In addition they may parallelize poorly in
virtualized environments. This changes the trade off for flushing IOVAs
such that minimizing the number of IOTLB flushes trumps any benefit of
cheaper queuing operations or increased paralellism.

In this scenario per-CPU flush queues pose several problems. Firstly
per-CPU memory is often quite limited prohibiting larger queues.
Secondly collecting IOVAs per-CPU but flushing via a global timeout
reduces the number of IOVAs flushed for each timeout especially on s390
where PCI interrupts may not be bound to a specific CPU.

Let's introduce a single flush queue mode that reuses the same queue
logic but only allocates a single global queue. This mode is selected by
dma-iommu if a newly introduced .shadow_on_flush flag is set in struct
dev_iommu. As a first user the s390 IOMMU driver sets this flag during
probe_device. With the unchanged small FQ size and timeouts this setting
is worse than per-CPU queues but a follow up patch will make the FQ size
and timeout variable. Together this allows the common IOVA flushing code
to more closely resemble the global flush behavior used on s390's
previous internal DMA API implementation.

Link: https://lore.kernel.org/all/9a466109-01c5-96b0-bf03-304123f435ee@arm.com/ (local)
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> #s390
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/iommu/dma-iommu.c  | 168 ++++++++++++++++++++++++++++++++++-----------
 drivers/iommu/s390-iommu.c |   3 +
 include/linux/iommu.h      |   2 +
 3 files changed, 134 insertions(+), 39 deletions(-)
---8<---
quoted hunk ↗ jump to hunk
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 182cc4c71e62..c3687e066ed7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -409,6 +409,7 @@ struct iommu_fault_param {
  * @priv:	 IOMMU Driver private data
  * @max_pasids:  number of PASIDs this device can consume
  * @attach_deferred: the dma domain attachment is deferred
+ * @shadow_on_flush: IOTLB flushes are used to sync shadow tables
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
  *	struct iommu_group	*iommu_group;
@@ -422,6 +423,7 @@ struct dev_iommu {
 	void				*priv;
 	u32				max_pasids;
 	u32				attach_deferred:1;
+	u32				shadow_on_flush:1;
This causes a merge conflict with a48ce36e2786f ("iommu: Prevent
RESV_DIRECT devices from blocking domains"), The resolution is trivial
though in that shadow_on_flush:1 can just be added after (or before)
require_direct:1. @Joro do you want me to sent a version with this
resolution regardless or will you resolve this when applying?
 };
 
 int iommu_device_register(struct iommu_device *iommu,
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help