[PATCH v2 2/2] mm: hugetlb: support gigantic surplus pages
From: Huang Shijie <hidden>
Date: 2016-11-10 07:04:11
Also in:
linux-mm
On Wed, Nov 09, 2016 at 04:55:49PM +0100, Gerald Schaefer wrote:
index 9fdfc24..5dbfd62 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1095,6 +1095,9 @@ static struct page *alloc_gigantic_page(int nid, unsigned int order)
 	unsigned long ret, pfn, flags;
 	struct zone *z;
+	if (nid == NUMA_NO_NODE)
+		nid = numa_mem_id();
+

Now counter.sh works (on s390) w/o the lockdep warning. However, it looks
Good news to me :) We have found the root cause of the s390 issue.
like this change will now result in inconsistent behavior compared to the normal sized hugepages, regarding surplus page allocation. Setting nid to numa_mem_id() means that only the node of the current CPU will be considered for allocating a gigantic page, as opposed to just "preferring" the current node in the normal size case (__hugetlb_alloc_buddy_huge_page() -> alloc_pages_node()) with a fallback to using other nodes.
Yes.
I am not really familiar with NUMA, and I might be wrong here, but if this is true then gigantic pages, which may be hard to allocate at runtime in general, will be even harder to find (as surplus pages) because you only look on the current node.
Okay, I will try to fix this in the next version.
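To make the difference between the two policies concrete, here is a minimal userspace sketch. It is not kernel code: free_ranges[], alloc_strict() and alloc_preferred() are hypothetical names invented for illustration, and the simple round-robin fallback order is an assumption, not what the page allocator's zonelists actually do.

```c
#define NR_NODES 4
#define NO_NODE  (-1)

/* Hypothetical per-node count of free gigantic-page-sized ranges. */
static int free_ranges[NR_NODES] = { 0, 0, 2, 1 };

/*
 * Strict policy: look only at the given node.
 * Returns the node that satisfied the request, or NO_NODE.
 */
static int alloc_strict(int nid)
{
	if (free_ranges[nid] > 0) {
		free_ranges[nid]--;
		return nid;
	}
	return NO_NODE;
}

/*
 * Preferred policy: try nid first, then fall back to the other
 * nodes (here simply in ascending order, wrapping around).
 */
static int alloc_preferred(int nid)
{
	for (int i = 0; i < NR_NODES; i++) {
		int node = (nid + i) % NR_NODES;

		if (alloc_strict(node) != NO_NODE)
			return node;
	}
	return NO_NODE;
}
```

With the sample data above, a strict request on node 0 fails even though nodes 2 and 3 have free ranges, while a preferred request starting at node 0 ends up being served from node 2.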
I honestly do not understand why alloc_gigantic_page() needs a nid parameter at all, since it looks like it will only be called from alloc_fresh_gigantic_page_node(), which in turn is only called from alloc_fresh_gigantic_page() in a "for_each_node" loop (at least before your patch). Now it could be an option to also use alloc_fresh_gigantic_page() in your patch, instead of directly calling alloc_gigantic_page(),
Yes, a good suggestion. But I need to make some changes to alloc_fresh_gigantic_page().

Thanks
Huang Shijie