Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages
From: Michal Hocko <mhocko@suse.com>
Date: 2021-02-17 15:02:36
Also in:
lkml
On Wed 17-02-21 11:08:15, Oscar Salvador wrote: [...]
+static bool alloc_and_dissolve_huge_page(struct hstate *h, struct page *page)
+{
+ gfp_t gfp_mask = htlb_alloc_mask(h);
+ nodemask_t *nmask = &node_states[N_MEMORY];
+ struct page *new_page;
+ bool ret = false;
+ int nid;
+
+ spin_lock(&hugetlb_lock);
+ /*
+ * Check one more time to make race-window smaller.
+ */
+ if (!PageHuge(page)) {
+ /*
+ * Dissolved from under our feet.
+ */
+ spin_unlock(&hugetlb_lock);
+ return true;
+ }Is this really necessary? dissolve_free_huge_page will take care of this and the race windown you are covering is really tiny.
+ + nid = page_to_nid(page); + spin_unlock(&hugetlb_lock); + + /* + * Before dissolving the page, we need to allocate a new one, + * so the pool remains stable. + */ + new_page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
wrt. fallback to other zones, I haven't realized that the primary usecase is a form of memory offlining (from virt-mem). I am not yet sure what the proper behavior is in that case but if breaking hugetlb pools, similar to the normal hotplug operation, is viable then this needs a special mode. We do not want a random alloc_contig_range user to do the same. So for starter I would go with __GFP_THISNODE here.
+ if (new_page) {
+ /*
+ * Ok, we got a new free hugepage to replace this one. Try to
+ * dissolve the old page.
+ */
+ if (!dissolve_free_huge_page(page)) {
+ ret = true;
+ } else if (dissolve_free_huge_page(new_page)) {
+ /*
+ * Seems the old page could not be dissolved, so try to
+ * dissolve the freshly allocated page. If that fails
+ * too, let us count the new page as a surplus. Doing so
+ * allows the pool to be re-balanced when pages are freed
+ * instead of enqueued again.
+ */
+ spin_lock(&hugetlb_lock);
+ h->surplus_huge_pages++;
+ h->surplus_huge_pages_node[nid]++;
+ spin_unlock(&hugetlb_lock);
+ }
+ /*
+ * Free it into the hugepage allocator
+ */
+ put_page(new_page);
+ }
+
+ return ret;
+}
+
+bool isolate_or_dissolve_huge_page(struct page *page)
+{
+ struct hstate *h = NULL;
+ struct page *head;
+ bool ret = false;
+
+ spin_lock(&hugetlb_lock);
+ if (PageHuge(page)) {
+ head = compound_head(page);
+ h = page_hstate(head);
+ }
+ spin_unlock(&hugetlb_lock);
+
+ if (!h)
+ /*
+ * The page might have been dissolved from under our feet.
+ * If that is the case, return success as if we dissolved it
+ * ourselves.
+ */
+ return true;
nit I would put the comment above the conditin for both cases. It reads
more easily that way. At least without { }.
+
+ if (hstate_is_gigantic(h))
+ /*
+ * Fence off gigantic pages as there is a cyclic dependency
+ * between alloc_contig_range and them.
+ */
+ return ret;
+
+ if(!page_count(head) && alloc_and_dissolve_huge_page(h, head))
+ ret = true;
+
+ return ret;
+}
+
struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve)
{Other than that I haven't noticed any surprises. -- Michal Hocko SUSE Labs