Re: [bug report] iommu_dma_unmap_sg() is very slow then running IO from remote numa node
From: Ming Lei <hidden>
Date: 2021-07-09 14:25:21
Also in: linux-arm-kernel, linux-iommu, lkml
On Fri, Jul 09, 2021 at 11:26:53AM +0100, Robin Murphy wrote:
> On 2021-07-09 09:38, Ming Lei wrote:
> > Hello,
> >
> > I observed that NVMe performance is very bad when running fio on one
> > CPU (aarch64) in remote numa node compared with the nvme pci numa node.
> >
> > Please see the test result[1] 327K vs. 34.9K.
> >
> > Latency trace shows that one big difference is in iommu_dma_unmap_sg(),
> > 1111 nsecs vs 25437 nsecs.
>
> Are you able to dig down further into that? iommu_dma_unmap_sg() itself
> doesn't do anything particularly special, so whatever makes a difference is
> probably happening at a lower level, and I suspect there's probably an SMMU
> involved. If for instance it turns out to go all the way down to
> __arm_smmu_cmdq_poll_until_consumed() because polling MMIO from the wrong
> node is slow, there's unlikely to be much you can do about that other than
> the global "go faster" knobs (iommu.strict and iommu.passthrough) with their
> associated compromises.
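For anyone reproducing this, here is a minimal sketch of how one might check whether either of the two knobs mentioned above is set on the running kernel. It assumes a Linux system where /proc/cmdline is readable. The helper name iommu_knobs is made up for illustration and is not part of any kernel or tool API.

```python
def iommu_knobs(cmdline: str) -> dict:
    """Return the iommu.strict / iommu.passthrough values found in a
    kernel command line string (hypothetical helper, illustration only)."""
    knobs = {"iommu.strict": None, "iommu.passthrough": None}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only record tokens of the form key=value for the two knobs.
        if key in knobs and sep:
            knobs[key] = value
    return knobs


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(iommu_knobs(f.read()))
    except OSError:
        print("could not read /proc/cmdline")
```

A value of None only means the knob was not passed on the command line, in which case the kernel's built-in default applies.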
The 'perf report' logs follow.

1) good (run fio from CPUs in the nvme's numa node)

-   34.86%     1.73%  fio  [nvme]  [k] nvme_process_cq
   - 33.13% nvme_process_cq
      - 32.93% nvme_pci_complete_rq
         - 24.92% nvme_unmap_data
            - 20.08% dma_unmap_sg_attrs
               - 19.79% iommu_dma_unmap_sg
                  - 19.55% __iommu_dma_unmap
                     - 16.86% arm_smmu_iotlb_sync
                        - 16.81% arm_smmu_tlb_inv_range_domain
                           - 14.73% __arm_smmu_tlb_inv_range
                                14.44% arm_smmu_cmdq_issue_cmdlist
                             0.89% __pi_memset
                             0.75% arm_smmu_atc_inv_domain
                     + 1.58% iommu_unmap_fast
                     + 0.71% iommu_dma_free_iova
            - 3.25% dma_unmap_page_attrs
               - 3.21% iommu_dma_unmap_page
                  - 3.14% __iommu_dma_unmap_swiotlb
                     - 2.86% __iommu_dma_unmap
                        - 2.48% arm_smmu_iotlb_sync
                           - 2.47% arm_smmu_tlb_inv_range_domain
                              - 2.19% __arm_smmu_tlb_inv_range
                                   2.16% arm_smmu_cmdq_issue_cmdlist
            + 1.34% mempool_free
         + 7.68% nvme_complete_rq
   + 1.73% _start

2) bad (run fio from CPUs not in the nvme's numa node)

-   49.25%     3.03%  fio  [nvme]  [k] nvme_process_cq
   - 46.22% nvme_process_cq
      - 46.07% nvme_pci_complete_rq
         - 41.02% nvme_unmap_data
            - 34.92% dma_unmap_sg_attrs
               - 34.75% iommu_dma_unmap_sg
                  - 34.58% __iommu_dma_unmap
                     - 33.04% arm_smmu_iotlb_sync
                        - 33.00% arm_smmu_tlb_inv_range_domain
                           - 31.86% __arm_smmu_tlb_inv_range
                                31.71% arm_smmu_cmdq_issue_cmdlist
                     + 0.90% iommu_unmap_fast
            - 5.17% dma_unmap_page_attrs
               - 5.15% iommu_dma_unmap_page
                  - 5.12% __iommu_dma_unmap_swiotlb
                     - 5.05% __iommu_dma_unmap
                        - 4.86% arm_smmu_iotlb_sync
                           - 4.85% arm_smmu_tlb_inv_range_domain
                              - 4.70% __arm_smmu_tlb_inv_range
                                   4.67% arm_smmu_cmdq_issue_cmdlist
            + 0.74% mempool_free
         + 4.83% nvme_complete_rq
   + 3.03% _start

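
To make the comparison between the two profiles easier to read, here is a rough sketch that adds up the two arm_smmu_cmdq_issue_cmdlist entries in each log (one under dma_unmap_sg_attrs, one under dma_unmap_page_attrs) and relates the sum to the nvme_process_cq total. The numbers are copied from the logs above. The dictionary layout and the function name cmdq_share are only illustrative, and dividing the two percentages assumes both are inclusive shares of the same sample total.

```python
# Percentages of total samples, copied from the two 'perf report' logs.
GOOD = {"sg": 14.44, "page": 2.16, "nvme_process_cq": 34.86}
BAD = {"sg": 31.71, "page": 4.67, "nvme_process_cq": 49.25}


def cmdq_share(profile):
    """Return (total % spent in arm_smmu_cmdq_issue_cmdlist,
    that total as a fraction of the nvme_process_cq percentage)."""
    total = profile["sg"] + profile["page"]
    return total, total / profile["nvme_process_cq"]


if __name__ == "__main__":
    for name, profile in (("good", GOOD), ("bad", BAD)):
        total, fraction = cmdq_share(profile)
        print(f"{name}: {total:.2f}% of samples, "
              f"{fraction:.0%} of nvme_process_cq")
```

In both cases nearly all of the unmap cost ends up in arm_smmu_cmdq_issue_cmdlist, and its share of the completion path grows noticeably in the remote-node run.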
Thanks,
Ming