Re: [PATCH] mm, hugetlb: Avoid double clearing for hugetlb pages

From: David Hildenbrand <hidden>
Date: 2020-10-20 13:36:31
Also in: linux-hardening, linux-mm

On 20.10.20 10:20, Michal Hocko wrote:

On Mon 19-10-20 15:28:53, Guilherme G. Piccoli wrote:
[...]

quoted

$ time echo 32768 > /proc/sys/vm/nr_hugepages
real    0m24.189s
user    0m0.000s
sys     0m24.184s

$ cat /proc/meminfo |grep "MemA\|Hugetlb"
MemAvailable:   30784732 kB
Hugetlb:        67108864 kB

* Without this patch, init_on_alloc=0
$ cat /proc/meminfo |grep "MemA\|Hugetlb"
MemAvailable:   97892752 kB
Hugetlb:               0 kB

$ time echo 32768 > /proc/sys/vm/nr_hugepages
real    0m0.316s
user    0m0.000s
sys     0m0.316s

Yes zeroying is quite costly and that is to be expected when the feature
is enabled. Hugetlb like other allocator users perform their own
initialization rather than go through __GFP_ZERO path. More on that
below.

Could you be more specific about why this is a problem. Hugetlb pool is
usualy preallocatd once during early boot. 24s for 65GB of 2MB pages
is non trivial amount of time but it doens't look like a major disaster
either. If the pool is allocated later it can take much more time due to
memory fragmentation.

I definitely do not want to downplay this but I would like to hear about
the real life examples of the problem.

[...]

quoted

Hi everybody, thanks in advance for the review/comments. I'd like to
point 2 things related to the implementation:

1) I understand that adding GFP flags is not really welcome by the
mm community; I've considered passing that as function parameter but
that would be a hacky mess, so I decided to add the flag since it seems
this is a fair use of the flag mechanism (to control actions on pages).
If anybody has a better/simpler suggestion to implement this, I'm all
ears - thanks!

This has been discussed already (http://lkml.kernel.org/r/20190514143537.10435-4-glider@google.com.
Previously it has been brought up in SLUB context AFAIR. Your numbers
are quite clear here but do we really need a gfp flag with all the
problems we tend to grow in with them?

One potential way around this specifically for hugetlb would be to use
__GFP_ZERO when allocating from the allocator and marking the fact in
the struct page while it is sitting in the pool. Page fault handler
could then skip the zeroying phase. Not an act of beauty TBH but it
fits into the existing model of the full control over initialization.
Btw. it would allow to implement init_on_free semantic as well. I
haven't implemented the actual two main methods
hugetlb_test_clear_pre_init_page and hugetlb_mark_pre_init_page because
I am not entirely sure about the current state of hugetlb struct page in
the pool. But there should be a lot of room in there (or in tail pages).
Mike will certainly know much better. But the skeleton of the patch
would look like something like this (not even compile tested).

Something like that is certainly nicer than proposed gfp flags.
(__GFP_NOINIT_ON_ALLOC is just ugly, especially, to optimize such
corner-case features)


-- 
Thanks,

David / dhildenb

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help