Thread (47 messages) 47 messages, 2 authors, 2025-01-07

Re: [PATCH v4 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

From: Alistair Popple <apopple@nvidia.com>
Date: 2024-12-18 22:49:25
Also in: linux-arm-kernel, linux-cxl, linux-doc, linux-ext4, linux-fsdevel, linux-mm, linux-xfs, lkml, nvdimm

On Tue, Dec 17, 2024 at 11:14:42PM +0100, David Hildenbrand wrote:
On 17.12.24 06:12, Alistair Popple wrote:
quoted
Currently ZONE_DEVICE page reference counts are initialised by core
memory management code in __init_zone_device_page() as part of the
memremap() call which driver modules make to obtain ZONE_DEVICE
pages. This initialises page refcounts to 1 before returning them to
the driver.

This was presumably done because it drivers had a reference of sorts
on the page. It also ensured the page could always be mapped with
vm_insert_page() for example and would never get freed (ie. have a
zero refcount), freeing drivers of manipulating page reference counts.
It probably dates back to copying that code from other zone-init code where
we
(a) Treat all available-at-boot memory as allocated before we release it to
the buddy
(b) Treat all hotplugged memory as allocated until we release it to the
buddy
 
Argh, thanks for the background.
As a side note, I'm working on converting (b) -- PageOffline pages -- to
have a refcount of 0 ("frozen").
[...]
quoted
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 24b68b4..f021e63 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1017,12 +1017,26 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
  	}
  	/*
-	 * ZONE_DEVICE pages are released directly to the driver page allocator
-	 * which will set the page count to 1 when allocating the page.
+	 * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC and
+	 * MEMORY_TYPE_FS_DAX pages are released directly to the driver page
+	 * allocator which will set the page count to 1 when allocating the
+	 * page.
+	 *
+	 * MEMORY_TYPE_GENERIC and MEMORY_TYPE_FS_DAX pages automatically have
+	 * their refcount reset to one whenever they are freed (ie. after
+	 * their refcount drops to 0).
  	 */
-	if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
-	    pgmap->type == MEMORY_DEVICE_COHERENT)
+	switch (pgmap->type) {
+	case MEMORY_DEVICE_PRIVATE:
+	case MEMORY_DEVICE_COHERENT:
+	case MEMORY_DEVICE_PCI_P2PDMA:
  		set_page_count(page, 0);
+		break;
+
+	case MEMORY_DEVICE_FS_DAX:
+	case MEMORY_DEVICE_GENERIC:
+		break;
+	}
  }
  /*

But that's a bit weird: we call __init_single_page()->init_page_count() to
initialize it to 1, to then set it back to 0.


Maybe we can just pass to __init_single_page() the refcount we want to have
directly? Can be a patch on top of course.
Once the dust settles on this series we won't need the pgmap->type check at
all because all ZONE_DEVICE pages will get an initial count of 0. I have some
follow up clean-ups for after this series is applied (particularly with regards
to pgmap refcounts), so if it's ok I'd rather do this as a follow-up.
Apart from that

Acked-by: David Hildenbrand <redacted>

-- 
Cheers,

David / dhildenb
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help