Thread (43 messages) 43 messages, 5 authors, 2025-08-14

Re: [PATCH v1 00/16] dma-mapping: migrate to physical address-based API

From: Demi Marie Obenour <hidden>
Date: 2025-08-09 16:53:31
Also in: linux-block, linux-doc, linux-iommu, linux-mm, linux-nvme, linux-trace-kernel, lkml, rust-for-linux, virtualization, xen-devel

On 8/9/25 09:34, Jason Gunthorpe wrote:
On Fri, Aug 08, 2025 at 08:51:08PM +0200, Marek Szyprowski wrote:
quoted
First - basing the API on the phys_addr_t.

Page based API had the advantage that it was really hard to abuse it and 
call for something that is not 'a normal RAM'. 
This is not true anymore. Today we have ZONE_DEVICE as a struct page
type with a whole bunch of non-dram sub-types:

enum memory_type {
	/* 0 is reserved to catch uninitialized type fields */
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

Few of which are kmappable/page_to_virtable() in a way that is useful
for the DMA API.

DMA API sort of ignores all of this and relies on the caller to not
pass in an incorrect struct page. eg we rely on things like the block
stack to do the right stuff when a MEMORY_DEVICE_PCI_P2PDMA is present
in a bio_vec.

Which is not really fundamentally different from just using
phys_addr_t in the first place.

Sure, this was a stronger argument when this stuff was originally
written, before ZONE_DEVICE was invented.
quoted
I initially though that phys_addr_t based API will somehow simplify
arch specific implementation, as some of them indeed rely on
phys_addr_t internally, but I missed other things pointed by
Robin. Do we have here any alternative?
I think it is less of a code simplification, more as a reduction in
conceptual load. When we can say directly there is no struct page type
anyhwere in the DMA API layers then we only have to reason about
kmap/phys_to_virt compatibly.

This is also a weaker overall requirement than needing an actual
struct page which allows optimizing other parts of the kernel. Like we
aren't forced to create MEMORY_DEVICE_PCI_P2PDMA stuct pages just to
use the dma api.

Again, any place in the kernel we can get rid of struct page the
smoother the road will be for the MM side struct page restructuring.

For example one of the bigger eventual goes here is to make a bio_vec
store phys_addr_t, not struct page pointers.

DMA API is not alone here, we have been de-struct-paging the kernel
for a long time now:

netdev: https://lore.kernel.org/linux-mm/20250609043225.77229-1-byungchul@sk.com/ (local)
slab: https://lore.kernel.org/linux-mm/20211201181510.18784-1-vbabka@suse.cz/ (local)
iommmu: https://lore.kernel.org/all/0-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com/ (local)
page tables: https://lore.kernel.org/linux-mm/20230731170332.69404-1-vishal.moola@gmail.com/ (local)
zswap: https://lore.kernel.org/all/20241216150450.1228021-1-42.hyeyoo@gmail.com/ (local)

With a long term goal that struct page only exists for legacy code,
and is maybe entirely compiled out of modern server kernels.
Why just server kernels?  I suspect client systems actually run
newer kernels than servers do.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help