Thread (33 messages) 33 messages, 4 authors, 2021-02-19

Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages

From: Michal Hocko <mhocko@suse.com>
Date: 2021-02-17 15:02:36
Also in: lkml

On Wed 17-02-21 11:08:15, Oscar Salvador wrote:
[...]
+static bool alloc_and_dissolve_huge_page(struct hstate *h, struct page *page)
+{
+	gfp_t gfp_mask = htlb_alloc_mask(h);
+	nodemask_t *nmask = &node_states[N_MEMORY];
+	struct page *new_page;
+	bool ret = false;
+	int nid;
+
+	spin_lock(&hugetlb_lock);
+	/*
+	 * Check one more time to make race-window smaller.
+	 */
+	if (!PageHuge(page)) {
+		/*
+		 * Dissolved from under our feet.
+		 */
+		spin_unlock(&hugetlb_lock);
+		return true;
+	}
Is this really necessary? dissolve_free_huge_page will take care of this
and the race windown you are covering is really tiny.
+
+	nid = page_to_nid(page);
+	spin_unlock(&hugetlb_lock);
+
+	/*
+	 * Before dissolving the page, we need to allocate a new one,
+	 * so the pool remains stable.
+	 */
+	new_page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
wrt. fallback to other zones, I haven't realized that the primary
usecase is a form of memory offlining (from virt-mem). I am not yet sure
what the proper behavior is in that case but if breaking hugetlb pools,
similar to the normal hotplug operation, is viable then this needs a
special mode. We do not want a random alloc_contig_range user to do the
same. So for starter I would go with __GFP_THISNODE here.
+	if (new_page) {
+		/*
+		 * Ok, we got a new free hugepage to replace this one. Try to
+		 * dissolve the old page.
+		 */
+		if (!dissolve_free_huge_page(page)) {
+			ret = true;
+		} else if (dissolve_free_huge_page(new_page)) {
+			/*
+			 * Seems the old page could not be dissolved, so try to
+			 * dissolve the freshly allocated page. If that fails
+			 * too, let us count the new page as a surplus. Doing so
+			 * allows the pool to be re-balanced when pages are freed
+			 * instead of enqueued again.
+			 */
+			spin_lock(&hugetlb_lock);
+			h->surplus_huge_pages++;
+			h->surplus_huge_pages_node[nid]++;
+			spin_unlock(&hugetlb_lock);
+		}
+		/*
+		 * Free it into the hugepage allocator
+		 */
+		put_page(new_page);
+	}
+
+	return ret;
+}
+
+bool isolate_or_dissolve_huge_page(struct page *page)
+{
+	struct hstate *h = NULL;
+	struct page *head;
+	bool ret = false;
+
+	spin_lock(&hugetlb_lock);
+	if (PageHuge(page)) {
+		head = compound_head(page);
+		h = page_hstate(head);
+	}
+	spin_unlock(&hugetlb_lock);
+
+	if (!h)
+		/*
+		 * The page might have been dissolved from under our feet.
+		 * If that is the case, return success as if we dissolved it
+		 * ourselves.
+		 */
+		return true;
nit I would put the comment above the conditin for both cases. It reads
more easily that way. At least without { }.
+
+	if (hstate_is_gigantic(h))
+		/*
+		 * Fence off gigantic pages as there is a cyclic dependency
+		 * between alloc_contig_range and them.
+		 */
+		return ret;
+
+	if(!page_count(head) && alloc_and_dissolve_huge_page(h, head))
+		ret = true;
+
+	return ret;
+}
+
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				    unsigned long addr, int avoid_reserve)
 {
Other than that I haven't noticed any surprises.
-- 
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help