Re: [External] Re: [PATCH v13 04/12] mm: hugetlb: defer freeing of HugeTLB pages

[PATCH v13 00/12] Free some vmemmap pages of HugeTLB page · Muchun Song <hidden> · 2021-01-17
[PATCH v13 01/12] mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 01/12] mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c · Miaohe Lin <linmiaohe@huawei.com> · 2021-01-25
[PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · David Rientjes <rientjes@google.com> · 2021-01-24
Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Randy Dunlap <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Muchun Song <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Randy Dunlap <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Muchun Song <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · David Rientjes <rientjes@google.com> · 2021-01-26
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Muchun Song <hidden> · 2021-01-26
Re: [External] Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · David Rientjes <rientjes@google.com> · 2021-01-26
Re: [PATCH v13 02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP · Miaohe Lin <linmiaohe@huawei.com> · 2021-01-26
[PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Mike Kravetz <hidden> · 2021-01-23
Re: [External] Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-23
Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-23
Re: [External] Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-24
[PATCH v13 06/12] mm: hugetlb: set the PageHWPoison to the raw error page · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 06/12] mm: hugetlb: set the PageHWPoison to the raw error page · David Rientjes <rientjes@google.com> · 2021-01-25
Re: [External] Re: [PATCH v13 06/12] mm: hugetlb: set the PageHWPoison to the raw error page · Muchun Song <hidden> · 2021-01-25
[PATCH v13 09/12] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 09/12] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap · David Hildenbrand <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 09/12] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap · Muchun Song <hidden> · 2021-01-26
Re: [PATCH v13 09/12] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap · Oscar Salvador <osalvador@suse.de> · 2021-01-26
Re: [External] Re: [PATCH v13 09/12] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap · Muchun Song <hidden> · 2021-01-26
[PATCH v13 10/12] mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate · Muchun Song <hidden> · 2021-01-17
[PATCH v13 07/12] mm: hugetlb: flush work when dissolving a HugeTLB page · Muchun Song <hidden> · 2021-01-17
[PATCH v13 12/12] mm: hugetlb: optimize the code with the help of the compiler · Muchun Song <hidden> · 2021-01-17
[PATCH v13 11/12] mm: hugetlb: gather discrete indexes of tail page · Muchun Song <hidden> · 2021-01-17
[PATCH v13 08/12] mm: hugetlb: introduce PageHugeInflight · Muchun Song <hidden> · 2021-01-17
[PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Rientjes <rientjes@google.com> · 2021-01-25
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-01-26
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-26
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Mike Kravetz <hidden> · 2021-01-25
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-01-26
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-01-27
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-28
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-28
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Mike Kravetz <hidden> · 2021-01-29
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-29
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-02-01
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Mike Kravetz <hidden> · 2021-02-02
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-28
Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · Muchun Song <hidden> · 2021-01-29
Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page · David Hildenbrand <hidden> · 2021-02-01
[PATCH v13 04/12] mm: hugetlb: defer freeing of HugeTLB pages · Muchun Song <hidden> · 2021-01-17
Re: [PATCH v13 04/12] mm: hugetlb: defer freeing of HugeTLB pages · David Rientjes <rientjes@google.com> · 2021-01-24
Re: [External] Re: [PATCH v13 04/12] mm: hugetlb: defer freeing of HugeTLB pages · Muchun Song <hidden> · 2021-01-25
Re: [PATCH v13 00/12] Free some vmemmap pages of HugeTLB page · Muchun Song <hidden> · 2021-01-20
Re: [PATCH v13 00/12] Free some vmemmap pages of HugeTLB page · Oscar Salvador <osalvador@suse.de> · 2021-01-20
Re: [External] Re: [PATCH v13 00/12] Free some vmemmap pages of HugeTLB page · Muchun Song <hidden> · 2021-01-20

From: Muchun Song <hidden>
Date: 2021-01-25 04:01:01
Also in: linux-fsdevel, linux-mm, lkml

On Mon, Jan 25, 2021 at 7:55 AM David Rientjes [off-list ref] wrote:


On Sun, 17 Jan 2021, Muchun Song wrote:

quoted

In the subsequent patch, we should allocate the vmemmap pages when
freeing HugeTLB pages. But update_and_free_page() is always called
with holding hugetlb_lock, so we cannot use GFP_KERNEL to allocate
vmemmap pages. However, we can defer the actual freeing in a kworker
to prevent from using GFP_ATOMIC to allocate the vmemmap pages.

The update_hpage_vmemmap_workfn() is where the call to allocate
vmemmmap pages will be inserted.

I think it's reasonable to assume that userspace can release free hugetlb
pages from the pool on oom conditions when reclaim has become too
expensive.  This approach now requires that we can allocate vmemmap pages
in a potential oom condition as a prerequisite for freeing memory, which
seems less than ideal.

And, by doing this through a kworker, we can presumably get queued behind
another work item that requires memory to make forward progress in this
oom condition.

Two thoughts:

- We're going to be freeing the hugetlb page after we can allocate the
  vmemmap pages, so why do we need to allocate with GFP_KERNEL?  Can't we
  simply dip into memory reserves using GFP_ATOMIC (and thus can be
  holding hugetlb_lock) because we know we'll be freeing more memory than
  we'll be allocating?

Right.

  I think requiring a GFP_KERNEL allocation to block
  to free memory for vmemmap when we'll be freeing memory ourselves is
  dubious.  This simplifies all of this.

Thanks for your thoughts. I just thought that we can go to reclaim
when there is no memory in the system. But we cannot block when
using GFP_KERNEL. Actually, we cannot deal with fail of memory
allocating. In the next patch, I try to sleep 100ms and then try again
to allocate memory when allocating memory fails.

- If the answer is that we actually have to use GFP_KERNEL for other
  reasons, what are your thoughts on pre-allocating the vmemmap as opposed
  to deferring to a kworker?  In other words, preallocate the necessary
  memory with GFP_KERNEL and put it on a linked list in struct hstate
  before acquiring hugetlb_lock.

put_page() can be used in an atomic context. Actually, we cannot sleep
in the __free_huge_page(). It seems a little tricky. Right?

quoted

Signed-off-by: Muchun Song <redacted>
Reviewed-by: Mike Kravetz <redacted>
---
 mm/hugetlb.c         | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.c | 12 ---------
 mm/hugetlb_vmemmap.h | 17 ++++++++++++
 3 files changed, 89 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 140135fc8113..c165186ec2cf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c

@@ -1292,15 +1292,85 @@ static inline void destroy_compound_gigantic_page(struct page *page,
                                              unsigned int order) { }
 #endif

-static void update_and_free_page(struct hstate *h, struct page *page)
+static void __free_hugepage(struct hstate *h, struct page *page);
+
+/*
+ * As update_and_free_page() is always called with holding hugetlb_lock, so we
+ * cannot use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
+ * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
+ * the vmemmap pages.
+ *
+ * The update_hpage_vmemmap_workfn() is where the call to allocate vmemmmap
+ * pages will be inserted.
+ *
+ * update_hpage_vmemmap_workfn() locklessly retrieves the linked list of pages
+ * to be freed and frees them one-by-one. As the page->mapping pointer is going
+ * to be cleared in update_hpage_vmemmap_workfn() anyway, it is reused as the
+ * llist_node structure of a lockless linked list of huge pages to be freed.
+ */
+static LLIST_HEAD(hpage_update_freelist);
+
+static void update_hpage_vmemmap_workfn(struct work_struct *work)
 {
-     int i;
+     struct llist_node *node;
+
+     node = llist_del_all(&hpage_update_freelist);
+
+     while (node) {
+             struct page *page;
+             struct hstate *h;
+
+             page = container_of((struct address_space **)node,
+                                  struct page, mapping);
+             node = node->next;
+             page->mapping = NULL;
+             h = page_hstate(page);
+
+             spin_lock(&hugetlb_lock);
+             __free_hugepage(h, page);
+             spin_unlock(&hugetlb_lock);

+             cond_resched();

Wouldn't it be better to hold hugetlb_lock for the iteration rather than
constantly dropping it and reacquiring it?  Use
cond_resched_lock(&hugetlb_lock) instead?

Great. We can use it. Thanks.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help