Re: [RFC PATCH 1/8] hugetlb: add per-hstate mutex to synchronize user adjustments

From: Michal Hocko <mhocko@suse.com>
Date: 2021-03-23 07:49:06
Also in: lkml

On Mon 22-03-21 09:57:14, Mike Kravetz wrote:

On 3/22/21 6:59 AM, Michal Hocko wrote:

On Fri 19-03-21 15:42:02, Mike Kravetz wrote:

The number of hugetlb pages can be adjusted by writing to the
sysfs/proc files nr_hugepages, nr_hugepages_mempolicy or
nr_overcommit_hugepages.  There is nothing to prevent two
concurrent modifications via these files.  The underlying routine
set_max_huge_pages() makes assumptions that only one occurrence is
running at a time.  Specifically, alloc_pool_huge_page uses a
hstate specific variable without any synchronization.

From the above it is not really clear whether the unsynchronized nature
of set_max_huge_pages is really a problem or a mere annoyance. I suspect
the latter because counters are properly synchronized with the
hugetlb_lock. It would be great to clarify that.

It is a problem and an annoyance.

The problem is that alloc_pool_huge_page -> for_each_node_mask_to_alloc is
called after dropping the hugetlb lock.  for_each_node_mask_to_alloc
uses the helper hstate_next_node_to_alloc which uses and modifies
h->next_nid_to_alloc.  Worst case would be two instances of set_max_huge_pages
trying to allocate pages on different sets of nodes.  Pages could get
allocated on the wrong nodes.
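
The interleaving described above can be modelled in userspace. The following is a minimal sketch, not the kernel code and not the patch itself: struct hstate_model, resize_lock, grow_pool() and run_two_resizes() are hypothetical names invented for illustration, a pthread mutex stands in for the proposed per-hstate mutex, and a bitmask stands in for the nodemask. It only shows the idea that holding one lock across a whole resize keeps the shared round-robin cursor from being advanced by two callers at once.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_NODES 8

/* Hypothetical stand-in for the kernel's struct hstate. */
struct hstate_model {
	pthread_mutex_t resize_lock;   /* serializes whole resize operations */
	int next_nid_to_alloc;         /* shared round-robin cursor */
	int pages_on_node[MAX_NODES];  /* pages "allocated" per node */
};

static void hstate_model_init(struct hstate_model *h)
{
	pthread_mutex_init(&h->resize_lock, NULL);
	h->next_nid_to_alloc = 0;
	for (int i = 0; i < MAX_NODES; i++)
		h->pages_on_node[i] = 0;
}

/*
 * Pick the first allowed node at or after the cursor, then advance the
 * cursor past it. Reads and writes shared state, so the caller must
 * hold resize_lock. Returns -1 if no node is allowed.
 */
static int next_node_to_alloc(struct hstate_model *h, unsigned allowed_mask)
{
	int nid = h->next_nid_to_alloc;

	allowed_mask &= (1u << MAX_NODES) - 1;
	if (!allowed_mask)
		return -1;
	assert(nid >= 0 && nid < MAX_NODES);
	while (!(allowed_mask & (1u << nid)))
		nid = (nid + 1) % MAX_NODES;
	h->next_nid_to_alloc = (nid + 1) % MAX_NODES;
	return nid;
}

/* One "user adjustment": add count pages, spread over the allowed nodes. */
static int grow_pool(struct hstate_model *h, unsigned allowed_mask, int count)
{
	int allocated = 0;

	pthread_mutex_lock(&h->resize_lock);
	for (int i = 0; i < count; i++) {
		int nid = next_node_to_alloc(h, allowed_mask);

		if (nid < 0)
			break;
		h->pages_on_node[nid]++;
		allocated++;
	}
	pthread_mutex_unlock(&h->resize_lock);
	return allocated;
}

struct resize_args {
	struct hstate_model *h;
	unsigned mask;
	int count;
};

static void *resize_thread(void *p)
{
	struct resize_args *a = p;

	grow_pool(a->h, a->mask, a->count);
	return NULL;
}

/* Run two concurrent resizes on disjoint node sets; return total pages. */
static int run_two_resizes(struct hstate_model *h)
{
	struct resize_args a = { h, 0x03u, 1000 };  /* nodes 0 and 1 */
	struct resize_args b = { h, 0x0cu, 1000 };  /* nodes 2 and 3 */
	pthread_t ta, tb;
	int total = 0;

	pthread_create(&ta, NULL, resize_thread, &a);
	pthread_create(&tb, NULL, resize_thread, &b);
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	for (int i = 0; i < MAX_NODES; i++)
		total += h->pages_on_node[i];
	return total;
}
```

In this model each resize runs as one unit, so whichever thread takes the lock first finishes its round-robin walk before the other starts, and the pages end up evenly split across each caller's own nodes. Without resize_lock, both threads would read and write next_nid_to_alloc and the counters concurrently, which is the unsynchronized access the patch description refers to.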

Yes, that is what I meant by the annoyance. On the other hand a parallel access
to a global knob maintaining a global resource should be expected to
have some side effects without an external synchronization unless it is
explicitly documented that such an access is synchronized internally.

I really doubt this problem has ever been experienced in practice.
However, when looking at the code it was a real annoyance. :)

IMHO it would be a bit of a stretch to consider it a real life problem.

I'll update the commit message to be more clear.

Thanks! Clarification will definitely help.
-- 
Michal Hocko
SUSE Labs
