Re: [PATCH 0/8] dma-mapping: migrate to physical address-based API
From: Leon Romanovsky <leon@kernel.org>
Date: 2025-07-30 13:40:33
Also in:
linux-doc, linux-iommu, linux-mm, linuxppc-dev, lkml, virtualization
On Wed, Jul 30, 2025 at 12:11:32PM +0100, Robin Murphy wrote:
On 2025-07-08 11:27 am, Marek Szyprowski wrote:quoted
On 30.06.2025 15:38, Christoph Hellwig wrote:quoted
On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:quoted
quoted
Thanks for this rework! I assume that the next step is to add map_phys callback also to the dma_map_ops and teach various dma-mapping providers to use it to avoid more phys-to-page-to-phys conversions.Probably Christoph will say yes, however I personally don't see any benefit in this. Maybe I wrong here, but all existing .map_page() implementation platforms don't support p2p anyway. They won't benefit from this such conversion.I think that conversion should eventually happen, and rather sooner than later.Agreed. Applied patches 1-7 to my dma-mapping-next branch. Let me know if one needs a stable branch with it.As the maintainer of iommu-dma, please drop the iommu-dma patch because it is broken. It does not in any way remove the struct page dependency from iommu-dma, it merely hides it so things can crash more easily in circumstances that clearly nobody's bothered to test.quoted
Leon, it would be great if You could also prepare an incremental patch adding map_phys callback to the dma_maps_ops, so the individual arch-specific dma-mapping providers can be then converted (or simplified in many cases) too.Marek, I'm surprised that even you aren't seeing why that would at best be pointless churn. The fundamental design of dma_map_page() operating on struct page is that it sits in between alloc_pages() at the caller and kmap_atomic() deep down in the DMA API implementation (which also subsumes any dependencies on having a kernel virtual address at the implementation end). The natural working unit for whatever replaces dma_map_page() will be whatever the replacement for alloc_pages() returns, and the replacement for kmap_atomic() operates on. Until that exists (and I simply cannot believe it would be an unadorned physical address) there cannot be any *meaningful* progress made towards removing the struct page dependency from the DMA API. If there is also a goal to kill off highmem before then, then logically we should just wait for that to land, then revert back to dma_map_single() being the first-class interface, and dma_map_page() can turn into a trivial page_to_virt() wrapper for the long tail of caller conversions. Simply obfuscating the struct page dependency today by dressing it up as a phys_addr_t with implicit baggage is not not in any way helpful. It only makes the code harder to understand and more bug-prone. Despite the disingenuous claims, it is quite blatantly the opposite of "efficient" for callers to do extra work to throw away useful information with page_to_phys(), and the implementation then have to re-derive that information with pfn_valid()/phys_to_page(). And by "bug-prone" I also include greater distractions like this misguided idea that the same API could somehow work for non-memory addresses too, so then everyone can move on bikeshedding VFIO while overlooking the fundamental flaws in the whole premise. I mean, besides all the issues I've already pointed out in that regard, not least the glaring fact that it's literally just a worse version of *an API we already have*, as DMA API maintainer do you *really* approve of a design that depends on callers abusing DMA_ATTR_SKIP_CPU_SYNC, yet will still readily blow up if they did then call a dma_sync op?
Robin, Marek I would like to ask you to do not drop this series and allow me to gradually change the code during my VFIO DMABUF adventure. The most reasonable way to prevent DMA_ATTR_SKIP_CPU_SYNC leakage is to introduce new DMA attribute (let's call it DMA_ATTR_MMIO for now) and pass it to both dma_map_phys() and dma_iova_link(). This flag will indicate that p2p type is PCI_P2PDMA_MAP_THRU_HOST_BRIDGE and call to right callbacks which will set IOMMU_MMIO flag and skip CPU sync, dma_map_phys() isn't entirely wrong, it just needs an extra tweaks. Thanks
Thanks, Robin.