Thread: 38 messages, 9 authors, 2026-01-23

Re: [PATCH v6 1/5] mm/zone_device: Reinitialize large zone device private folios

From: Matthew Brost <matthew.brost@intel.com>
Date: 2026-01-16 20:31:34
Also in: amd-gfx, dri-devel, intel-xe, kvm, linux-cxl, linux-mm, lkml, nouveau

On Fri, Jan 16, 2026 at 08:17:22PM +0100, Vlastimil Babka wrote:
On 1/16/26 18:49, Jason Gunthorpe wrote:
On Fri, Jan 16, 2026 at 12:10:16PM +0100, Francois Dugast wrote:
-void zone_device_page_init(struct page *page, unsigned int order)
+void zone_device_page_init(struct page *page, struct dev_pagemap *pgmap,
+			   unsigned int order)
 {
+	struct page *new_page = page;
+	unsigned int i;
+
 	VM_WARN_ON_ONCE(order > MAX_ORDER_NR_PAGES);
 
+	for (i = 0; i < (1UL << order); ++i, ++new_page) {
+		struct folio *new_folio = (struct folio *)new_page;
+
+		/*
+		 * new_page could have been part of previous higher order folio
+		 * which encodes the order, in page + 1, in the flags bits. We
+		 * blindly clear bits which could have set my order field here,
+		 * including page head.
+		 */
+		new_page->flags.f &= ~0xffUL;	/* Clear possible order, page head */
+
+#ifdef NR_PAGES_IN_LARGE_FOLIO
+		/*
+		 * This pointer math looks odd, but new_page could have been
+		 * part of a previous higher order folio, which sets _nr_pages
+		 * in page + 1 (new_page). Therefore, we use pointer casting to
+		 * correctly locate the _nr_pages bits within new_page which
+		 * could have modified by previous higher order folio.
+		 */
+		((struct folio *)(new_page - 1))->_nr_pages = 0;
+#endif
This seems too weird, why is it in the loop?  There is only one
_nr_pages per folio.
I suppose we could be getting say an order-9 folio that was previously used
as two order-8 folios? And each of them had their _nr_pages in their head
Yes, this is a good example. At this point we have no idea what the
order(s) of the previous allocation(s) were. Stale _nr_pages values
could sit at multiple places within the range the loop covers, so we
have to clear it everywhere.
and we can't know that at this point so we have to reset everything?
Yes, see above, correct. We have no visibility into the previous state
of the pages, so the only option is to reset everything.
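
To make the order-9-from-two-order-8 case concrete, here is a minimal
userspace sketch. It is not kernel code: struct toy_page, nr_pages_slot
and every toy_* helper are hypothetical names made up for illustration.
The only thing taken from this thread is the rule that a large folio's
_nr_pages lives in the memory of the page after its head.

```c
#include <stddef.h>

#define TOY_NPAGES 512	/* enough pages for one order-9 range */

/*
 * Hypothetical stand-in for struct page. It keeps only the word that a
 * folio headed at the *previous* page would use as its _nr_pages.
 */
struct toy_page {
	unsigned long nr_pages_slot;
};

/* Model a large folio headed at pages[head]: the count lands in head + 1. */
static void toy_make_folio(struct toy_page *pages, size_t head,
			   unsigned int order)
{
	if (order > 0)
		pages[head + 1].nr_pages_slot = 1UL << order;
}

/* Count the pages in [0, n) that still carry a non-zero slot. */
static size_t toy_count_stale(const struct toy_page *pages, size_t n)
{
	size_t i, stale = 0;

	for (i = 0; i < n; i++)
		if (pages[i].nr_pages_slot)
			stale++;
	return stale;
}

/* Reset only the slot a single new head at pages[0] would own. */
static void toy_reset_head_only(struct toy_page *pages)
{
	pages[1].nr_pages_slot = 0;
}

/* Reset the slot in every page of the range, like the loop in the patch. */
static void toy_reset_all(struct toy_page *pages, unsigned int order)
{
	size_t i;

	for (i = 0; i < ((size_t)1 << order); i++)
		pages[i].nr_pages_slot = 0;
}
```

With two order-8 toy folios created at indices 0 and 256,
toy_reset_head_only() leaves the slot at index 257 populated, while
toy_reset_all(pages, 9) clears both.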
AFAIU this would not be a problem if the clearing of the previous state was
done upon freeing, as e.g. v4 did, but I think you also argued it meant
processing the pages when freeing and then again at reallocation, so it's
now like this instead?
Yes, if we clean up the previous folio state upon freeing, then this
problem goes away, but then we are back to passing in the order as an
argument to ->folio_free().
Or maybe you mean that stray _nr_pages in some tail page from previous
lifetimes can't affect the current lifetime in a wrong way for something
looking at said page? I don't know immediately.
This is mostly zeroing some memory in the tail pages? Why?

Why can't this use the normal helpers, like memmap_init_compound()?

 struct folio *new_folio = page

 /* First 4 tail pages are part of struct folio */
 for (i = 4; i < (1UL << order); i++) {
     prep_compound_tail(..)
 }

 prep_compound_head(page, order)
 new_folio->_nr_pages = 0

??
I've beaten this to death with Alistair; the normal helpers do not work here.

An order-0 allocation could have a stale _nr_pages set in its own page,
whereas new_folio->_nr_pages lives in the memory of page + 1.
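
A rough userspace illustration of that layout point, under loud
assumptions: struct toy_page and struct toy_folio below are hypothetical,
heavily simplified stand-ins (the real structures are larger and vary by
kernel version), and the toy_* helpers are invented for this sketch. It
only shows why writing new_folio->_nr_pages touches the page after the
head, and how to reach the word a previous folio may have left inside a
given page.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified page: a flags word plus opaque payload words. */
struct toy_page {
	unsigned long flags;
	unsigned long words[7];
};

/* Hypothetical folio overlay spanning two consecutive toy pages. */
struct toy_folio {
	struct toy_page head;		/* overlays page + 0 */
	unsigned long flags_1;		/* overlays page + 1, first word */
	unsigned long _nr_pages;	/* overlays page + 1, second word */
	unsigned long pad[6];
};

/* Which page, relative to the folio head, holds _nr_pages? */
static size_t toy_nr_pages_page_index(void)
{
	return offsetof(struct toy_folio, _nr_pages) / sizeof(struct toy_page);
}

/* Byte offset of the _nr_pages word inside the page that holds it. */
static size_t toy_nr_pages_offset_in_page(void)
{
	return offsetof(struct toy_folio, _nr_pages) % sizeof(struct toy_page);
}

/* Read the word a folio headed at (page - 1) would have stored in *page. */
static unsigned long toy_read_own_nr_pages(const struct toy_page *page)
{
	unsigned long val;

	memcpy(&val, (const unsigned char *)page +
	       toy_nr_pages_offset_in_page(), sizeof(val));
	return val;
}

/*
 * Clear that word in *page itself. This plays the role of the
 * (page - 1) cast in the patch without dereferencing page - 1.
 */
static void toy_clear_own_nr_pages(struct toy_page *page)
{
	memset((unsigned char *)page + toy_nr_pages_offset_in_page(), 0,
	       sizeof(unsigned long));
}
```

In this toy layout toy_nr_pages_page_index() returns 1, so clearing the
field through a folio pointer at the head never touches a stale value
stored in the head page's own memory; toy_clear_own_nr_pages() does.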

Matt
Jason
  