Thread (35 messages) 35 messages, 4 authors, 2025-10-01

Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2025-09-24 21:42:20
Also in: linux-iommu, linux-patches, lkml

On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
Update the invs array with the invalidations required by each domain type
during attachment operations.

Only an SVA domain or a paging domain will have an invs array:
 a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
    per SID

 b. Non-nesting-parent paging domain with no ATS-enabled master will add
    a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU

 c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
    (b) and add an INV_TYPE_ATS per SID

 d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
    an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
    will add an INV_TYPE_ATS_FULL per SID
Just some minor forward looking clarification - this behavior should
be triggered when a nest-parent is attached through the viommu using
a nesting domain with a vSTE.

A nesting-parent that is just normally attached should act like a
normal S2 since it does not and can not have a two stage S1 on top of
it.

We can't quite get there yet until the next series of changing how the
VMID allocation works.
The per-domain invalidation is not needed, until the domain is attached to
a master, i.e. a possible translation request. Giving this clears a way to
allowing the domain to be attached to many SMMUs, and avoids any pointless
invalidation overheads during a teardown if there are no STE/CDs referring
to the domain. This also means, when the last device is detached, the old
domain must flush its ASID or VMID because any iommu_unmap() call after it
wouldn't initiate any invalidation given an empty domain invs array.
Grammar/phrasing in this paragraph
quoted hunk ↗ jump to hunk
Introduce some arm_smmu_invs helper functions for building scratch arrays,
preparing and installing old/new domain's invalidation arrays.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <redacted>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  22 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
 2 files changed, 332 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 246c6d84de3ab..e4e0e066108cc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -678,6 +678,8 @@ struct arm_smmu_inv {
 /**
  * struct arm_smmu_invs - Per-domain invalidation array
  * @num_invs: number of invalidations in the flexible array
+ * @old: flag to synchronize with reader
+ * @rwlock: optional rwlock to fench ATS operations
  * @rcu: rcu head for kfree_rcu()
  * @inv: flexible invalidation array
  *
@@ -703,6 +705,8 @@ struct arm_smmu_inv {
  */
 struct arm_smmu_invs {
 	size_t num_invs;
+	rwlock_t rwlock;
+	u8 old;
 	struct rcu_head rcu;
 	struct arm_smmu_inv inv[];
 };
@@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
 	new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
 	if (!new_invs)
 		return ERR_PTR(-ENOMEM);
+	rwlock_init(&new_invs->rwlock);
 	new_invs->num_invs = num_invs;
 	return new_invs;
 }
Put these and related hunks in the patch adding arm_smmu_invs
quoted hunk ↗ jump to hunk
@@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
 			i++;
 		} else if (cmp == 0) {
 			/* same item */
-			if (refcount_dec_and_test(&invs->inv[i].users))
+			if (refcount_dec_and_test(&invs->inv[i].users)) {
+				/* Notify the caller about this deletion */
+				refcount_set(&to_unref->inv[j].users, 1);
 				num_dels++;
This is a bit convoluted. Instead of marking the entry and then
iterating the list again just directly call a function to do the
invalidation.
+static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
+				     struct arm_smmu_domain *smmu_domain,
+				     struct arm_smmu_invs *invs, char *name)
+{
+	size_t i;
+
+	dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
+		smmu_domain->domain.type, name, invs->num_invs);
+	for (i = 0; i < invs->num_invs; i++) {
+		struct arm_smmu_inv *cur = &invs->inv[i];
+
+		dev_dbg(master->dev,
+			"  entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
+			cur->type, cur->id, refcount_read(&cur->users));
+	}
+}
Move all the debug code to its own commit and don't send it
+static void
+arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
+{
+	struct arm_smmu_inv_state *invst = &state->new_domain_invst;
+
+	if (!invst->invs_ptr)
+		return;
+
+	rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
+	/*
+	 * Committed to updating the STE, using the new invalidation array, and
+	 * acquiring any racing IOPTE updates.
+	 */
+	smp_mb();
We are commited to updating the STE. Make the invalidation list
visable to parallel map/unmap threads and acquire any racying IOPTE
updates.
+	kfree_rcu(invst->old_invs, rcu);
+}
+
+/*
+ * When an array entry's users count reaches zero, it means the ASID/VMID is no
+ * longer being invalidated by map/unmap and must be cleaned. The rule is that
+ * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
+ */
+static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
+{
+	size_t i;
+
+	for (i = 0; i != invs->num_invs; i++) {
Just remove the loop and accept struct arm_smmu_inv * and call it
directly.
+	if (!new_invs) {
+		size_t new_num = old_invs->num_invs;
+
+		/*
+		 * OOM. Couldn't make a copy. Leave the array unoptimized. But
+		 * trim its size if some tailing entries are marked as trash.
+		 */
+		while (new_num != 0) {
+			if (refcount_read(&old_invs->inv[new_num - 1].users))
+				break;
+			new_num--;
+		}
Would be nicer to have arm_smmu_invs_unref return the new size so we
don't need this loop

Looks Ok to me otherwise

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help