Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2025-09-24 21:42:20
Also in:
linux-iommu, linux-patches, lkml
On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
Update the invs array with the invalidations required by each domain type
during attachment operations.
Only an SVA domain or a paging domain will have an invs array:
a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
per SID
b. Non-nesting-parent paging domain with no ATS-enabled master will add
a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
(b) and add an INV_TYPE_ATS per SID
d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
will add an INV_TYPE_ATS_FULL per SIDJust some minor forward looking clarification - this behavior should be triggered when a nest-parent is attached through the viommu using a nesting domain with a vSTE. A nesting-parent that is just normally attached should act like a normal S2 since it does not and can not have a two stage S1 on top of it. We can't quite get there yet until the next series of changing how the VMID allocation works.
The per-domain invalidation is not needed, until the domain is attached to a master, i.e. a possible translation request. Giving this clears a way to allowing the domain to be attached to many SMMUs, and avoids any pointless invalidation overheads during a teardown if there are no STE/CDs referring to the domain. This also means, when the last device is detached, the old domain must flush its ASID or VMID because any iommu_unmap() call after it wouldn't initiate any invalidation given an empty domain invs array.
Grammar/phrasing in this paragraph
quoted hunk ↗ jump to hunk
Introduce some arm_smmu_invs helper functions for building scratch arrays, preparing and installing old/new domain's invalidation arrays. Co-developed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <redacted> --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++- 2 files changed, 332 insertions(+), 2 deletions(-)diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 246c6d84de3ab..e4e0e066108cc 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h@@ -678,6 +678,8 @@ struct arm_smmu_inv { /** * struct arm_smmu_invs - Per-domain invalidation array * @num_invs: number of invalidations in the flexible array + * @old: flag to synchronize with reader + * @rwlock: optional rwlock to fench ATS operations * @rcu: rcu head for kfree_rcu() * @inv: flexible invalidation array *@@ -703,6 +705,8 @@ struct arm_smmu_inv { */ struct arm_smmu_invs { size_t num_invs; + rwlock_t rwlock; + u8 old; struct rcu_head rcu; struct arm_smmu_inv inv[]; };@@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs) new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL); if (!new_invs) return ERR_PTR(-ENOMEM); + rwlock_init(&new_invs->rwlock); new_invs->num_invs = num_invs; return new_invs; }
Put these and related hunks in the patch adding arm_smmu_invs
quoted hunk ↗ jump to hunk
@@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs, i++; } else if (cmp == 0) { /* same item */ - if (refcount_dec_and_test(&invs->inv[i].users)) + if (refcount_dec_and_test(&invs->inv[i].users)) { + /* Notify the caller about this deletion */ + refcount_set(&to_unref->inv[j].users, 1); num_dels++;
This is a bit convoluted. Instead of marking the entry and then iterating the list again just directly call a function to do the invalidation.
+static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_invs *invs, char *name)
+{
+ size_t i;
+
+ dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
+ smmu_domain->domain.type, name, invs->num_invs);
+ for (i = 0; i < invs->num_invs; i++) {
+ struct arm_smmu_inv *cur = &invs->inv[i];
+
+ dev_dbg(master->dev,
+ " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
+ cur->type, cur->id, refcount_read(&cur->users));
+ }
+}Move all the debug code to its own commit and don't send it
+static void
+arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
+{
+ struct arm_smmu_inv_state *invst = &state->new_domain_invst;
+
+ if (!invst->invs_ptr)
+ return;
+
+ rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
+ /*
+ * Committed to updating the STE, using the new invalidation array, and
+ * acquiring any racing IOPTE updates.
+ */
+ smp_mb();We are commited to updating the STE. Make the invalidation list visable to parallel map/unmap threads and acquire any racying IOPTE updates.
+ kfree_rcu(invst->old_invs, rcu);
+}
+
+/*
+ * When an array entry's users count reaches zero, it means the ASID/VMID is no
+ * longer being invalidated by map/unmap and must be cleaned. The rule is that
+ * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
+ */
+static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
+{
+ size_t i;
+
+ for (i = 0; i != invs->num_invs; i++) {Just remove the loop and accept struct arm_smmu_inv * and call it directly.
+ if (!new_invs) {
+ size_t new_num = old_invs->num_invs;
+
+ /*
+ * OOM. Couldn't make a copy. Leave the array unoptimized. But
+ * trim its size if some tailing entries are marked as trash.
+ */
+ while (new_num != 0) {
+ if (refcount_read(&old_invs->inv[new_num - 1].users))
+ break;
+ new_num--;
+ }Would be nicer to have arm_smmu_invs_unref return the new size so we don't need this loop Looks Ok to me otherwise Jason