Re: [External] Re: [PATCH v3 05/21] mm/hugetlb: Introduce pgtable allocation/freeing helpers
From: Muchun Song <hidden>
Date: 2020-11-11 03:42:20
Also in:
linux-fsdevel, linux-mm, lkml
On Wed, Nov 11, 2020 at 8:47 AM Mike Kravetz [off-list ref] wrote:
On 11/8/20 6:10 AM, Muchun Song wrote:quoted
On x86_64, vmemmap is always PMD mapped if the machine has hugepages support and if we have 2MB contiguos pages and PMD aligned. If we want to free the unused vmemmap pages, we have to split the huge pmd firstly. So we should pre-allocate pgtable to split PMD to PTE.diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a0007902fafb..5c7be2ee7e15 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c@@ -1303,6 +1303,108 @@ static inline void destroy_compound_gigantic_page(struct page *page,...quoted
+static inline void vmemmap_pgtable_init(struct page *page) +{ + page_huge_pte(page) = NULL; +} + +static void vmemmap_pgtable_deposit(struct page *page, pgtable_t pgtable) +{ + /* FIFO */ + if (!page_huge_pte(page)) + INIT_LIST_HEAD(&pgtable->lru); + else + list_add(&pgtable->lru, &page_huge_pte(page)->lru); + page_huge_pte(page) = pgtable; +} + +static pgtable_t vmemmap_pgtable_withdraw(struct page *page) +{ + pgtable_t pgtable; + + /* FIFO */ + pgtable = page_huge_pte(page); + page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru, + struct page, lru); + if (page_huge_pte(page)) + list_del(&pgtable->lru); + return pgtable; +} + +static int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page) +{ + int i; + pgtable_t pgtable; + unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h); + + if (!nr) + return 0; + + vmemmap_pgtable_init(page); + + for (i = 0; i < nr; i++) { + pte_t *pte_p; + + pte_p = pte_alloc_one_kernel(&init_mm); + if (!pte_p) + goto out; + vmemmap_pgtable_deposit(page, virt_to_page(pte_p)); + } + + return 0; +out: + while (i-- && (pgtable = vmemmap_pgtable_withdraw(page))) + pte_free_kernel(&init_mm, page_to_virt(pgtable)); + return -ENOMEM; +} + +static void vmemmap_pgtable_free(struct hstate *h, struct page *page) +{ + pgtable_t pgtable; + unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h); + + if (!nr) + return; + + pgtable = page_huge_pte(page); + if (!pgtable) + return; + + while (nr-- && (pgtable = vmemmap_pgtable_withdraw(page))) + pte_free_kernel(&init_mm, page_to_virt(pgtable)); +}I may be confused. In patch 9 of this series, the first call to vmemmap_pgtable_free() is made:quoted
@@ -1645,6 +1799,10 @@ void free_huge_page(struct page *page) static void prep_new_huge_page(struct hstate *h, struct page *page, int nid) { + free_huge_page_vmemmap(h, page); + /* Must be called before the initialization of @page->lru */ + vmemmap_pgtable_free(h, page); + INIT_LIST_HEAD(&page->lru); set_compound_page_dtor(page, HUGETLB_PAGE_DTOR); set_hugetlb_cgroup(page, NULL);When I saw that comment in previous patch series, I assumed page->lru was being used to store preallocated pages and pages to free. However, unless
Yeah, you are right.
I am reading the code incorrectly it does not appear page->lru (of the huge page) is being used for this purpose. Is that correct? If it is correct, would using page->lru of the huge page make this code simpler? I am just missing the reason why you are using page_huge_pte(page)->lru
For 1GB HugeTLB pages, we should pre-allocate more than one page
table. So I use a linked list. The page_huge_pte(page) is the list head.
Because the page->lru shares storage with page->pmd_huge_pte.
+ /* Must be called before the initialization of @page->lru */
+ vmemmap_pgtable_free(h, page);
+
INIT_LIST_HEAD(&page->lru);
Here we initialize the lru. So the vmemmap_pgtable_free should
be called before this. It seems like that I should point out this "share"
in the comment.
Thanks.
-- Mike Kravetz
-- Yours, Muchun