Re: [PATCH v2 03/11] mm/gup: migrate PIN_LONGTERM dev coherent pages to system

From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2021-12-08 13:54:09
Also in: amd-gfx, dri-devel, linux-mm, linux-xfs

On Wed, Dec 08, 2021 at 10:31:58PM +1100, Alistair Popple wrote:

On Tuesday, 7 December 2021 5:52:43 AM AEDT Alex Sierra wrote:

Avoid long term pinning for Coherent device type pages. This could
interfere with their own device memory manager.
If caller tries to get user device coherent pages with PIN_LONGTERM flag
set, those pages will be migrated back to system memory.

Signed-off-by: Alex Sierra <redacted>
 mm/gup.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 886d6148d3d0..1572eacf07f4 100644
--- a/mm/gup.c
+++ b/mm/gup.c

@@ -1689,17 +1689,37 @@ struct page *get_dump_page(unsigned long addr)
 #endif /* CONFIG_ELF_CORE */
 
 #ifdef CONFIG_MIGRATION
+static int migrate_device_page(unsigned long address,
+				struct page *page)
+{
+	struct vm_area_struct *vma = find_vma(current->mm, address);
+	struct vm_fault vmf = {
+		.vma = vma,
+		.address = address & PAGE_MASK,
+		.flags = FAULT_FLAG_USER,
+		.pgoff = linear_page_index(vma, address),
+		.gfp_mask = GFP_KERNEL,
+		.page = page,
+	};
+	if (page->pgmap && page->pgmap->ops->migrate_to_ram)
+		return page->pgmap->ops->migrate_to_ram(&vmf);

How does this synchronise against pgmap being released? As I understand things
at this point we're not holding a reference on either the page or pgmap, so
the page and therefore the pgmap may have been freed.

For sure, this can't keep touching the pages[] array after it unpinned
them:

 	if (gup_flags & FOLL_PIN) {
 		unpin_user_pages(pages, nr_pages);

               ^^^^^^^^^^^^^^^^^^^

 	} else {
 		for (i = 0; i < nr_pages; i++)
 			put_page(pages[i]);
 	}
+	if (is_device_page(head))
+		return migrate_device_page(start + page_index * PAGE_SIZE, head);

It was safe before this patch as isolate_lru_page(head) has a
get_page() inside.
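
The ordering issue can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_page`, `toy_get_page()`, `toy_put_page()` and `toy_migrate()` are hypothetical names invented for illustration, not kernel APIs, and the pin is modelled as a single plain (non-atomic) reference. It shows why touching a page after dropping the last reference is unsafe, and how taking an extra reference first (as the get_page() inside isolate_lru_page() does) keeps the page alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: a refcount and a "freed" flag. */
struct toy_page {
	int refcount;
	bool freed;
};

/* Take an extra reference; only legal while the page is still alive. */
static void toy_get_page(struct toy_page *p)
{
	assert(p->refcount > 0 && !p->freed);
	p->refcount++;
}

/* Drop a reference; the page is "freed" when the count reaches zero. */
static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/* Pretend migration: returns 0 on success, -1 if the page is already gone. */
static int toy_migrate(struct toy_page *p)
{
	return p->freed ? -1 : 0;
}

/* Ordering criticised above: drop the pin, then keep using the page. */
static int unpin_then_migrate(struct toy_page *p)
{
	toy_put_page(p);		/* models unpin_user_pages() */
	return toy_migrate(p);		/* page may already be freed here */
}

/* Safer ordering: hold our own reference across the unpin. */
static int get_unpin_migrate(struct toy_page *p)
{
	int ret;

	toy_get_page(p);		/* keep the page alive */
	toy_put_page(p);		/* models unpin_user_pages() */
	ret = toy_migrate(p);
	toy_put_page(p);		/* drop our temporary reference */
	return ret;
}
```

In this model `unpin_then_migrate()` fails whenever the pin was the last reference, while `get_unpin_migrate()` succeeds because the page cannot be freed until the temporary reference is dropped. The real kernel paths are more involved (pin counts, pgmap lifetime), so this only captures the ordering argument.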

Also, please try hard not to turn this function into goto spaghetti.

I think a similar problem exists for device private fault handling as well and
it has been on my list of things to fix for a while. I think the solution is to
call try_get_page(), except it doesn't work with device pages due to the whole
refcount thing. That issue is blocking a fair bit of work now so I've started
looking into it.

Where is this?

Jason
