Thread (5 messages), 3 authors, 2012-08-01

Re: [PATCH v2] list corruption by gather_surp

From: Michal Hocko <hidden>
Date: 2012-08-01 06:51:16

On Tue 31-07-12 18:13:06, Cliff Wickman wrote:
> On Mon, Jul 30, 2012 at 02:22:24PM +0200, Michal Hocko wrote:
> > On Fri 27-07-12 17:32:15, Cliff Wickman wrote:
> > > From: Cliff Wickman <redacted>
> > >
> > > v2: diff'd against linux-next
> > >
> > > I am seeing list corruption occurring from within gather_surplus_pages()
> > > (mm/hugetlb.c).  The problem occurs in a RHEL6 kernel under a heavy load,
> > > and seems to be because this function drops the hugetlb_lock.
> > > The list_add() in gather_surplus_pages() seems to need to be protected by
> > > the lock.
> > > (I don't have a similar test for a linux-next kernel)
> >
> > Because you cannot reproduce or you just didn't test it with linux-next?
> >
> > > I have CONFIG_DEBUG_LIST=y, and am running an MPI application with 64 threads
> > > and a library that creates a large heap of hugetlbfs pages for it.
> > >
> > > The below patch fixes the problem.
> > > The gist of this patch is that gather_surplus_pages() does not have to drop
> >
> > But you cannot hold a spinlock while allocating memory because the
> > allocation is not atomic and you could deadlock easily.
> >
> > > the lock if alloc_buddy_huge_page() is told whether the lock is already held.
> >
> > The changelog doesn't actually explain how the list gets corrupted.
> > alloc_buddy_huge_page doesn't provide the freshly allocated page to use
> > so nobody could get and free it. enqueue_huge_page happens under hugetlb_lock.
> > I am sorry but I do not see how we could race here.
>
> I finally got my test running on a linux-next kernel and could not
> reproduce the problem.
> So I agree that no race seems possible now.  Disregard this patch.
>
> I'll offer the fix to the distro of the old kernel on which I saw the
> problem.

But please note that the patch is not correct as mentioned above.

-- 
Michal Hocko
SUSE Labs
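
The locking argument in this thread (drop the lock around a sleeping
allocation, keep new pages on a list no other thread can see, and only
touch the shared list with the lock held) can be illustrated with a small
userspace sketch. This is not the mm/hugetlb.c code: a pthread mutex
stands in for hugetlb_lock, malloc() stands in for the huge page
allocator, and every name below (pool_lock, gather_pages, demo_gather,
struct page_node) is hypothetical.

```c
/* Userspace sketch only; all names are hypothetical, not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct page_node {
    struct page_node *next;
};

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static struct page_node *pool_head;  /* shared list, protected by pool_lock */
static int pool_count;               /* protected by pool_lock */

/*
 * Called with pool_lock held; returns with pool_lock held.
 * Returns 0 on success, -1 if an allocation failed.
 */
static int gather_pages(int needed)
{
    struct page_node *private_head = NULL;  /* visible to this thread only */
    int ret = 0;

    assert(needed >= 0);

    /* Drop the lock: the allocator may block, so it must not run under it. */
    pthread_mutex_unlock(&pool_lock);
    for (int i = 0; i < needed; i++) {
        struct page_node *p = malloc(sizeof(*p));
        if (!p) {
            ret = -1;
            break;
        }
        /* Private list: no lock needed, nobody else can reach it yet. */
        p->next = private_head;
        private_head = p;
    }
    /* Retake the lock before touching the shared list. */
    pthread_mutex_lock(&pool_lock);

    while (private_head) {
        struct page_node *p = private_head;
        private_head = p->next;
        if (ret) {
            free(p);             /* failure: give everything back */
        } else {
            p->next = pool_head; /* success: publish under the lock */
            pool_head = p;
            pool_count++;
        }
    }
    return ret;
}

/* Takes the lock, gathers, and returns the resulting shared count. */
int demo_gather(int needed)
{
    int count;

    pthread_mutex_lock(&pool_lock);
    gather_pages(needed);
    count = pool_count;
    pthread_mutex_unlock(&pool_lock);
    return count;
}
```

In this sketch the unlocked window only ever modifies the private list, so
dropping the lock cannot by itself corrupt the shared one. That is the
shape of the argument made above, under the stated assumptions.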
