Thread (92 messages) 92 messages, 13 authors, 2023-07-26

Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5] page_pool: remove PP_FLAG_PAGE_FRAG flag)

From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2023-07-03 21:43:23
Also in: linux-arm-kernel, linux-mediatek, linux-rdma, linux-wireless, lkml

On Sun, Jul 02, 2023 at 11:22:33PM -0700, Mina Almasry wrote:
On Sun, Jul 2, 2023 at 9:20 PM David Ahern [off-list ref] wrote:
quoted
On 6/29/23 8:27 PM, Mina Almasry wrote:
quoted
Hello Jakub, I'm looking into device memory (peer-to-peer) networking
actually, and I plan to pursue using the page pool as a front end.

Quick description of what I have so far:
current implementation uses device memory with struct pages; I am
putting all those pages in a gen_pool, and we have written an
allocator that allocates pages from the gen_pool. In the driver, we
use this allocator instead of alloc_page() (the driver in question is
gve which currently doesn't use the page pool). When the driver is
done with the p2p page, it simply decrements the refcount on it and
the page is freed back to the gen_pool.
Quick update here, I was able to get my implementation working with
the page pool as a front end with the memory provider API Jakub wrote
here:
https://github.com/kuba-moo/linux/tree/pp-providers

The main complication indeed was the fact that my device memory pages
are ZONE_DEVICE pages, which are incompatible with the page_pool due
to the union in struct page. I thought of a couple of approaches to
resolve that.

1. Make my device memory pages non-ZONE_DEVICE pages. 
Hard no on this from a mm perspective.. We need P2P memory to be
properly tagged and have the expected struct pages to be DMA mappable
and otherwise, you totally break everything if you try to do this..
2. Convert the pages from ZONE_DEVICE pages to page_pool pages and
vice versa as they're being inserted and removed from the page pool.
This is kind of scary, it is very, very, fragile to rework the pages
like this. Eg what happens when the owning device unplugs and needs to
revoke these pages? I think it would likely crash..

I think it also technically breaks the DMA API as we may need to look
into the pgmap to do cache ops on some architectures.

I suggest you try to work with 8k folios and then the tail page's
struct page is empty enough to store the information you need..
Or allocate per page memory and do a memdesc like thing..

Though overall, you won't find devices creating struct pages for their
P2P memory today, so I'm not sure what the purpose is. Jonathan
already got highly slammed for proposing code to the kernel that was
unusable. Please don't repeat that. Other than a special NVMe use case
the interface for P2P is DMABUF right now and it is not struct page
backed.

Even if we did get to struct pages for device memory, it is highly
likely cases you are interested in will be using larger than 4k
folios, so page pool would need to cope with this nicely as well.

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help