[PATCH 2/2] mm: hugetlb: support gigantic surplus pages

From: Gerald Schaefer <hidden>
Date: 2016-11-08 19:27:50
Also in: linux-mm

On Tue, 8 Nov 2016 17:17:28 +0800
Huang Shijie <hidden> wrote:

> I will look at the lockdep issue.
>
> I tested the new patch (will be sent out later) on the arm64 platform,
> and I did not hit the lockdep issue with lockdep enabled.
> The following is my config:
>
> 	CONFIG_LOCKD=y
> 	CONFIG_LOCKD_V4=y
> 	CONFIG_LOCKUP_DETECTOR=y
> 	# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
> 	CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
> 	CONFIG_DEBUG_SPINLOCK=y
> 	CONFIG_DEBUG_LOCK_ALLOC=y
> 	CONFIG_PROVE_LOCKING=y
> 	CONFIG_LOCKDEP=y
> 	CONFIG_LOCK_STAT=y
> 	CONFIG_DEBUG_LOCKDEP=y
> 	CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
>
> So am I missing something?

Those options should be OK. Meanwhile I looked into this a little more,
and the problematic line/lock is spin_lock_irqsave(&z->lock, flags) at
the top of alloc_gigantic_page(). From the lockdep trace we see that
it is triggered by an mmap(), and then hugetlb_acct_memory() ->
__alloc_huge_page() -> alloc_gigantic_page().

However, in between those functions (inside gather_surplus_pages())
a NUMA_NO_NODE node id comes into play. As a result,
alloc_gigantic_page() ends up being called with NUMA_NO_NODE as nid
(which is -1). Since -1 is not a valid node index,
NODE_DATA(nid)->node_zones will then reach into Nirvana.

So, I guess the problem is a missing NUMA_NO_NODE check in
alloc_gigantic_page(), similar to the one in
__hugetlb_alloc_buddy_huge_page(). And somehow this was not a problem
before the gigantic surplus change.
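
To illustrate the kind of check I mean, here is a minimal user-space
sketch. It is not kernel code and not a proposed patch: node_model,
nodes[], local_node_model() and alloc_gigantic_page_model() are
hypothetical names made up for this example, and falling back to a
"local" node is only one assumed way of handling NUMA_NO_NODE. Only the
value of NUMA_NO_NODE (-1) is taken from the discussion above.

```c
#define NUMA_NO_NODE (-1)	/* same value as discussed above */
#define MAX_NODES 2		/* hypothetical node count for this model */

/* Hypothetical stand-in for the per-node data behind NODE_DATA(nid). */
struct node_model {
	int free_gigantic_pages;
};

/* Node 0 has nothing to give, node 1 has three pages (made-up numbers). */
static struct node_model nodes[MAX_NODES] = { { 0 }, { 3 } };

/* Hypothetical stand-in for "the node the caller is running on". */
static int local_node_model(void)
{
	return 1;
}

/*
 * Model of the missing check: turn NUMA_NO_NODE into a real node id
 * before the id is used as an array index.
 *
 * Returns the node the page was taken from, or -1 if the node id is
 * out of range or the node has no pages left.
 */
static int alloc_gigantic_page_model(int nid)
{
	if (nid == NUMA_NO_NODE)
		nid = local_node_model();	/* assumed fallback policy */

	if (nid < 0 || nid >= MAX_NODES)
		return -1;			/* never index with a bad nid */

	if (nodes[nid].free_gigantic_pages == 0)
		return -1;

	nodes[nid].free_gigantic_pages--;
	return nid;
}
```

In this model, calling alloc_gigantic_page_model(NUMA_NO_NODE) takes a
page from the assumed local node instead of indexing nodes[-1]. Whether
the real fix should fall back to the local node or try all nodes is a
separate question.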
