Thread (95 messages) 95 messages, 8 authors, 2024-10-17

Re: [PATCH v2 2/8] iommu/arm-smmu-v3: Use S2FWB when available

From: Mostafa Saleh <smostafa@google.com>
Date: 2024-09-02 09:32:13
Also in: kvm, linux-acpi, linux-iommu, linux-patches

On Fri, Aug 30, 2024 at 01:40:19PM -0300, Jason Gunthorpe wrote:
On Fri, Aug 30, 2024 at 03:12:54PM +0000, Mostafa Saleh wrote:
quoted
quoted
+	/*
+	 * If for some reason the HW does not support DMA coherency then using
+	 * S2FWB won't work. This will also disable nesting support.
+	 */
+	if (FIELD_GET(IDR3_FWB, reg) &&
+	    (smmu->features & ARM_SMMU_FEAT_COHERENCY))
+		smmu->features |= ARM_SMMU_FEAT_S2FWB;
I think that’s for the SMMU coherency which in theory is not related to the
master which FWB overrides, so this check is not correct.
Yes, I agree, in theory.

However the driver today already links them together:

	case IOMMU_CAP_CACHE_COHERENCY:
		/* Assume that a coherent TCU implies coherent TBUs */
		return master->smmu->features & ARM_SMMU_FEAT_COHERENCY;

So this hunk was a continuation of that design.
quoted
What I meant in the previous thread that we should set FWB only for coherent
masters as (in attach s2):
	if (smmu->features & ARM_SMMU_FEAT_S2FWB && dev_is_dma_coherent(master->dev)
		// set S2FWB in STE
I think as I explained in that thread, it is not really correct
either. There is no reason to block using S2FWB for non-coherent
masters that are not used with VFIO. The page table will still place
the correct memattr according to the IOMMU_CACHE flag, S2FWB just
slightly changes the encoding.
It’s not just the encoding that changes, as
- Without FWB, stage-2 combine attributes
- While with FWB, it overrides them.

So a cacheable mapping in stage-2 can lead to a non-cacheable
(or with different cachableitiy attributes) transaction based on the
input. I am not sure though if there is such case in the kernel.

Also, that logic doesn't only apply to VFIO, but also for stage-2
only SMMUs that use stage-2 for kernel DMA.
For VFIO, non-coherent masters need to be blocked from VFIO entirely
and should never get even be allowed to get here.

If anything should be changed then it would be the above
IOMMU_CAP_CACHE_COHERENCY test, and I don't know if
dev_is_dma_coherent() would be correct there, or if it should do some
ACPI inspection or what.
I agree, I believe that this assumption is not accurate, I am not sure
what is the right approach here, but in concept I think we shouldn’t
enable FWB for non-coherent devices (using dev_is_dma_coherent() or
other check)

Thanks,
Mostafa
So let's drop the above hunk, it already happens implicitly because
VFIO checks it via IOMMU_CAP_CACHE_COHERENCY and it makes more sense
to put the assumption in one place.

Thanks,
Jason
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help