Re: [PATCH v3 03/20] mm: let _folio_nr_pages overlay memcg_data in first tail page
From: David Hildenbrand <hidden>
Date: 2025-03-05 10:29:59
Also in: cgroups, linux-doc, linux-fsdevel, linux-mm, lkml
Subsystem: memory management - core, the rest
Maintainers: Andrew Morton, David Hildenbrand, Linus Torvalds
On 03.03.25 17:29, David Hildenbrand wrote:
Let's free up some more of the "unconditionally available on 64BIT" space
in order-1 folios by letting _folio_nr_pages overlay memcg_data in the
first tail page (second folio page). Consequently, we have the
optimization now whenever we have CONFIG_MEMCG, independent of 64BIT.

We have to make sure that page->memcg on tail pages does not return
"surprises". page_memcg_check() already properly refuses PageTail().
Let's do that earlier in print_page_owner_memcg() to avoid printing
wrong "Slab cache page" information. No other code should touch that
field on tail pages of compound pages.

Reset the "_nr_pages" to 0 when splitting folios, or when freeing them
back to the buddy (to avoid false page->memcg_data "bad page" reports).

Note that in __split_huge_page(), folio_nr_pages() would stop working
already as soon as we start messing with the subpages.

Most kernel configs should have at least CONFIG_MEMCG enabled, even if
disabled at runtime. 64byte "struct memmap" is what we usually have
on 64BIT.

While at it, rename "_folio_nr_pages" to "_nr_pages".

Hopefully memdescs / dynamically allocating "struct folio" in the future
will further clean this up, e.g., making _nr_pages available in all
configs and maybe even in small folios. Doing that should be fairly easy
on top of this change.

Reviewed-by: Kirill A. Shutemov <redacted>
Signed-off-by: David Hildenbrand <redacted>
---
 include/linux/mm.h       |  4 ++--
 include/linux/mm_types.h | 30 ++++++++++++++++++++++--------
 mm/huge_memory.c         | 16 +++++++++++++---
 mm/internal.h            |  4 ++--
 mm/page_alloc.c          |  6 +++++-
 mm/page_owner.c          |  2 +-
 6 files changed, 45 insertions(+), 17 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a743321dc1a5d..694704217df8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1199,10 +1199,10 @@ static inline unsigned int folio_large_order(const struct folio *folio)
 	return folio->_flags_1 & 0xff;
 }
 
-#ifdef CONFIG_64BIT
+#ifdef NR_PAGES_IN_LARGE_FOLIO
 static inline long folio_large_nr_pages(const struct folio *folio)
 {
-	return folio->_folio_nr_pages;
+	return folio->_nr_pages;
 }
 #else
 static inline long folio_large_nr_pages(const struct folio *folio)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 689b2a7461892..e81be20bbabc6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -287,6 +287,11 @@ typedef struct {
 	unsigned long val;
 } swp_entry_t;
 
+#if defined(CONFIG_MEMCG) || defined(CONFIG_SLAB_OBJ_EXT)
+/* We have some extra room after the refcount in tail pages. */
+#define NR_PAGES_IN_LARGE_FOLIO
+#endif
+
 /**
  * struct folio - Represents a contiguous set of bytes.
  * @flags: Identical to the page flags.
@@ -312,7 +317,7 @@ typedef struct {
  * @_large_mapcount: Do not use directly, call folio_mapcount().
  * @_nr_pages_mapped: Do not use outside of rmap and debug code.
  * @_pincount: Do not use directly, call folio_maybe_dma_pinned().
- * @_folio_nr_pages: Do not use directly, call folio_nr_pages().
+ * @_nr_pages: Do not use directly, call folio_nr_pages().
  * @_hugetlb_subpool: Do not use directly, use accessor in hugetlb.h.
  * @_hugetlb_cgroup: Do not use directly, use accessor in hugetlb_cgroup.h.
  * @_hugetlb_cgroup_rsvd: Do not use directly, use accessor in hugetlb_cgroup.h.
@@ -377,13 +382,20 @@ struct folio {
 			unsigned long _flags_1;
 			unsigned long _head_1;
 	/* public: */
-			atomic_t _large_mapcount;
-			atomic_t _entire_mapcount;
-			atomic_t _nr_pages_mapped;
-			atomic_t _pincount;
-#ifdef CONFIG_64BIT
-			unsigned int _folio_nr_pages;
-#endif
+			union {
+				struct {
+					atomic_t _large_mapcount;
+					atomic_t _entire_mapcount;
+					atomic_t _nr_pages_mapped;
+					atomic_t _pincount;
+				};
+				unsigned long _usable_1[4];
+			};
+			atomic_t _mapcount_1;
+			atomic_t _refcount_1;
+#ifdef NR_PAGES_IN_LARGE_FOLIO
+			unsigned int _nr_pages;
+#endif /* NR_PAGES_IN_LARGE_FOLIO */
 	/* private: the union with struct page is transitional */
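For readers following the layout argument, here is a minimal userspace sketch of the overlay the commit message describes. It is not kernel code: "struct toy_page", "struct toy_folio" and the toy_* helpers are hypothetical names, plain ints stand in for atomic_t, and the "1 << order" fallback in the #else branch is an assumption based on folio_large_order() shown in the hunk above. It only illustrates why a non-zero "_nr_pages" shows up as non-zero memcg_data in the first tail page, and why it is reset to 0 before the pages are split or freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_MEMCG || CONFIG_SLAB_OBJ_EXT. */
#define NR_PAGES_IN_LARGE_FOLIO

/* Simplified model of "struct page": only the field sizes matter here. */
struct toy_page {
	unsigned long flags;
	unsigned long words[5];
	int mapcount;
	int refcount;
	unsigned long memcg_data;
};

/* Simplified model of "struct folio": head page plus first tail page. */
struct toy_folio {
	struct toy_page head;
	union {
		struct toy_page tail;		/* "struct page" view */
		struct {			/* "struct folio" view */
			unsigned long _flags_1;
			unsigned long _head_1;
			unsigned long _usable_1[4];
			int _mapcount_1;
			int _refcount_1;
#ifdef NR_PAGES_IN_LARGE_FOLIO
			unsigned int _nr_pages;
#endif
		};
	};
};

#ifdef NR_PAGES_IN_LARGE_FOLIO
/* The whole point: _nr_pages sits exactly where memcg_data would be. */
_Static_assert(offsetof(struct toy_folio, _nr_pages) ==
	       offsetof(struct toy_folio, tail.memcg_data),
	       "_nr_pages must overlay memcg_data of the first tail page");
#endif

/* Record the order in the low byte of _flags_1 and cache the page count. */
static inline void toy_folio_set_order(struct toy_folio *folio,
				       unsigned int order)
{
	folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order;
#ifdef NR_PAGES_IN_LARGE_FOLIO
	folio->_nr_pages = 1U << order;
#endif
}

static inline long toy_folio_large_nr_pages(const struct toy_folio *folio)
{
#ifdef NR_PAGES_IN_LARGE_FOLIO
	return folio->_nr_pages;
#else
	/* Assumed fallback: derive the count from the stored order. */
	return 1L << (folio->_flags_1 & 0xff);
#endif
}

/* Models the reset done when splitting or freeing back to the buddy. */
static inline void toy_folio_reset_nr_pages(struct toy_folio *folio)
{
#ifdef NR_PAGES_IN_LARGE_FOLIO
	folio->_nr_pages = 0;
#endif
}
```

In this model, a check of tail.memcg_data after toy_folio_set_order() sees a non-zero value, which corresponds to the false "bad page" report the commit message wants to avoid; after toy_folio_reset_nr_pages() the field reads 0 again.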
@Andrew

The following on top to make htmldoc happy. There will be two conflicts
to be resolved in two patches because of the added "/* private: " under
"_pincount".

It's fairly straightforward to resolve (just keep "/* private:" below any
changes), but let me know if I should resend a v4 instead for this.

From 9d9ff38c2ea14f1b4dab29f099ec0f6be683f3fe Mon Sep 17 00:00:00 2001
From: David Hildenbrand <redacted>
Date: Wed, 5 Mar 2025 11:14:52 +0100
Subject: [PATCH] fixup: mm: let _folio_nr_pages overlay memcg_data in first
 tail page

Make "make htmldoc" happy by marking non-private placeholder entries
private.

Signed-off-by: David Hildenbrand <redacted>
---
 include/linux/mm_types.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e81be20bbabc..6062c12c3871 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -381,18 +381,20 @@ struct folio {
 		struct {
 			unsigned long _flags_1;
 			unsigned long _head_1;
-	/* public: */
 			union {
 				struct {
+	/* public: */
 					atomic_t _large_mapcount;
 					atomic_t _entire_mapcount;
 					atomic_t _nr_pages_mapped;
 					atomic_t _pincount;
+	/* private: the union with struct page is transitional */
 				};
 				unsigned long _usable_1[4];
 			};
 			atomic_t _mapcount_1;
 			atomic_t _refcount_1;
+	/* public: */
 #ifdef NR_PAGES_IN_LARGE_FOLIO
 			unsigned int _nr_pages;
 #endif /* NR_PAGES_IN_LARGE_FOLIO */
--
2.48.1
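
As background on the markers the fixup moves around: kernel-doc treats "/* private: */" and "/* public: */" inside a struct body as toggles, and members in a private span are left out of the generated documentation and need no "@member" description. A small sketch with a hypothetical "struct toy_desc" (not taken from the patch) shows the pattern; the markers are ordinary comments and have no effect on the layout.

```c
#include <assert.h>
#include <stddef.h>

/**
 * struct toy_desc - Hypothetical descriptor, for illustration only.
 * @count: Number of entries; documented as usual.
 * @nr: Cached size; documented again after the private span.
 */
struct toy_desc {
	unsigned int count;
	/* private: placeholder storage, hidden from the generated docs */
	unsigned long _usable[2];
	/* public: */
	unsigned int nr;
};
```

Here "_usable" has no "@_usable" line in the comment block, which is fine only because it sits between the two markers.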
--
Cheers,
David / dhildenb