Thread: 45 messages, 6 authors, 2021-03-25

Re: [RFC PATCH 1/8] hugetlb: add per-hstate mutex to synchronize user adjustments

From: Mike Kravetz <hidden>
Date: 2021-03-22 16:59:07
Also in: lkml

On 3/22/21 6:59 AM, Michal Hocko wrote:
> On Fri 19-03-21 15:42:02, Mike Kravetz wrote:
>> The number of hugetlb pages can be adjusted by writing to the
>> sysfs/proc files nr_hugepages, nr_hugepages_mempolicy or
>> nr_overcommit_hugepages.  There is nothing to prevent two
>> concurrent modifications via these files.  The underlying routine
>> set_max_huge_pages() makes assumptions that only one occurrence is
>> running at a time.  Specifically, alloc_pool_huge_page uses an
>> hstate-specific variable without any synchronization.
>
> From the above it is not really clear whether the unsynchronized nature
> of set_max_huge_pages is really a problem or a mere annoyance. I suspect
> the latter because counters are properly synchronized with the
> hugetlb_lock. It would be great to clarify that.
 
It is a problem and an annoyance.

The problem is that alloc_pool_huge_page -> for_each_node_mask_to_alloc is
called after dropping the hugetlb lock.  for_each_node_mask_to_alloc
uses the helper hstate_next_node_to_alloc which uses and modifies
h->next_nid_to_alloc.  Worst case would be two instances of set_max_huge_pages
trying to allocate pages on different sets of nodes.  Pages could get
allocated on the wrong nodes.
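
To make the interleaving concrete, here is a minimal userspace sketch of a shared round-robin cursor. It is not the kernel code: the model_* names, the four-node limit and the plain bitmask standing in for a nodemask are all assumptions made for illustration, and only the unsynchronized update of next_nid_to_alloc is modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 4   /* assumption: a small fixed node count */

/* Hypothetical stand-in for struct hstate; only the shared cursor is modeled. */
struct model_hstate {
	int next_nid_to_alloc;
};

/* Return the first node after nid (wrapping around) that is set in allowed. */
static int model_next_node_allowed(int nid, unsigned int allowed)
{
	for (int i = 1; i <= MODEL_MAX_NODES; i++) {
		int cand = (nid + i) % MODEL_MAX_NODES;

		if (allowed & (1u << cand))
			return cand;
	}
	return -1;	/* not reached for a non-empty mask */
}

/* Pick a node for this caller and advance the shared cursor, with no locking. */
static int model_next_node_to_alloc(struct model_hstate *h, unsigned int allowed)
{
	int nid = h->next_nid_to_alloc;

	assert(allowed != 0);	/* the model requires a non-empty mask */
	if (!(allowed & (1u << nid)))
		nid = model_next_node_allowed(nid, allowed);
	h->next_nid_to_alloc = model_next_node_allowed(nid, allowed);
	return nid;
}

/*
 * Caller A asks for four pages spread over nodes {0,1,2,3}.  If interfere
 * is true, a second caller B restricted to nodes {2,3} takes one step
 * after A's first allocation, moving the cursor that both callers share.
 */
static void model_run(int seen_by_a[4], bool interfere)
{
	struct model_hstate h = { .next_nid_to_alloc = 0 };
	const unsigned int mask_a = 0xF;	/* nodes 0-3 */
	const unsigned int mask_b = 0xC;	/* nodes 2-3 */

	for (int i = 0; i < 4; i++) {
		seen_by_a[i] = model_next_node_to_alloc(&h, mask_a);
		if (interfere && i == 0)
			(void)model_next_node_to_alloc(&h, mask_b);
	}
}
```

In this model, A alone gets nodes 0, 1, 2, 3.  With a single step from B in between, A gets 0, 3, 0, 1, so node 0 is used twice and node 2 is skipped.  Holding one lock across the whole loop would keep each caller's sequence intact.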

I really doubt this problem has ever been experienced in practice.
However, when looking at the code it was a real annoyance. :)

I'll update the commit message to be more clear.
-- 
Mike Kravetz