Thread: 29 messages, 4 authors, 2021-02-06

Re: [PATCH v14 0/8] Free some vmemmap pages of HugeTLB page

From: Joao Martins <hidden>
Date: 2021-02-05 22:24:54
Also in: linux-fsdevel, linux-mm, lkml

On 2/4/21 3:50 AM, Muchun Song wrote:
> Hi all,
[...]
> When a HugeTLB is freed to the buddy system, we should allocate 6 pages for
> vmemmap pages and restore the previous mapping relationship.
>
> Apart from 2MB HugeTLB page, we also have 1GB HugeTLB page. It is similar
> to the 2MB HugeTLB page. We also can use this approach to free the vmemmap
> pages.
>
> In this case, for the 1GB HugeTLB page, we can save 4094 pages. This is a
> very substantial gain. On our server, run some SPDK/QEMU applications which
> will use 1024GB hugetlbpage. With this feature enabled, we can save ~16GB
> (1G hugepage)/~12GB (2MB hugepage) memory.
>
> Because there are vmemmap page tables reconstruction on the freeing/allocating
> path, it increases some overhead. Here are some overhead analysis.
[...]
> Although the overhead has increased, the overhead is not significant. Like Mike
> said, "However, remember that the majority of use cases create hugetlb pages at
> or shortly after boot time and add them to the pool. So, additional overhead is
> at pool creation time. There is no change to 'normal run time' operations of
> getting a page from or returning a page to the pool (think page fault/unmap)".

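
As a sanity check, the savings quoted above (6 pages per 2MB page, 4094 pages per 1GB page, ~12GB/~16GB for a 1024GB pool) can be reproduced with a quick calculation. This is only a sketch: the 4 KiB base page and 64-byte struct page are assumptions typical of x86-64 and are not stated in the message, and the "2 vmemmap pages kept per huge page" figure is inferred from the quoted numbers (8 - 6 and 4096 - 4094).

```python
# Back-of-the-envelope check of the savings quoted above.
# Assumptions (not stated in the message): 4 KiB base pages and a
# 64-byte struct page, as on a typical x86-64 configuration.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
KEPT_VMEMMAP_PAGES = 2  # inferred: 8 - 6 for 2MB pages, 4096 - 4094 for 1GB pages

MiB, GiB = 1 << 20, 1 << 30


def vmemmap_pages(hugepage_size):
    """Number of base pages holding the struct pages of one huge page."""
    nr_struct_pages = hugepage_size // PAGE_SIZE
    return nr_struct_pages * STRUCT_PAGE_SIZE // PAGE_SIZE


def saved_bytes(pool_size, hugepage_size):
    """Memory freed across a HugeTLB pool of pool_size bytes."""
    nr_hugepages = pool_size // hugepage_size
    freed_per_hugepage = vmemmap_pages(hugepage_size) - KEPT_VMEMMAP_PAGES
    return nr_hugepages * freed_per_hugepage * PAGE_SIZE


pool = 1024 * GiB
for name, size in (("2MB", 2 * MiB), ("1GB", GiB)):
    freed = vmemmap_pages(size) - KEPT_VMEMMAP_PAGES
    print(f"{name}: {freed} pages freed per huge page, "
          f"{saved_bytes(pool, size) / GiB:.2f} GiB saved for a 1024GB pool")
```

Under those assumptions the 2MB case comes out at exactly 12 GiB and the 1GB case at just under 16 GiB, which agrees with the approximate figures in the cover letter.
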
Despite the overhead, and in addition to the memory gains from this series,
there's an additional benefit of your vmemmap page reuse trick that isn't
mentioned here: page (un)pinners will see an improvement, I presume because
there are fewer memmap pages and thus the tail/head pages stay in cache more
often.
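
To illustrate that presumption, here is a toy model that counts the distinct physical cache lines touched when walking every struct page of a single 1G HugeTLB page, with and without vmemmap reuse. It is a sketch under assumptions, not a description of the kernel implementation: the sizes (4 KiB pages, 64-byte struct page, 64-byte cache lines) are typical x86-64 values, the cache is assumed to be physically tagged, and the layout "vmemmap page 0 kept, all later vmemmap pages aliased to one shared physical page" is inferred from the 4094-of-4096 figure in the cover letter.

```python
# Toy model: distinct physical cache lines touched when walking every
# struct page of one 1G HugeTLB page. All constants are assumptions.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
CACHE_LINE = 64
NR_STRUCT_PAGES = (1 << 30) // PAGE_SIZE                        # 262144
STRUCT_PAGES_PER_VMEMMAP_PAGE = PAGE_SIZE // STRUCT_PAGE_SIZE   # 64


def backing_page(vmemmap_index, reuse):
    """Abstract index of the physical page backing a vmemmap page."""
    if reuse and vmemmap_index >= 1:
        return 1  # every tail vmemmap page aliases the same physical page
    return vmemmap_index


def lines_touched(reuse):
    """Count distinct physical cache lines read by a full walk."""
    lines = set()
    for i in range(NR_STRUCT_PAGES):
        vmemmap_index, slot = divmod(i, STRUCT_PAGES_PER_VMEMMAP_PAGE)
        phys = (backing_page(vmemmap_index, reuse) * PAGE_SIZE
                + slot * STRUCT_PAGE_SIZE)
        lines.add(phys // CACHE_LINE)
    return len(lines)


for reuse in (False, True):
    n = lines_touched(reuse)
    print(f"reuse={reuse}: {n} cache lines ({n * CACHE_LINE // 1024} KiB)")
```

In this model the walk touches 16 MiB of struct page data without reuse and only 8 KiB with it, which would fit comfortably in cache. Real behaviour depends on the cache hierarchy and on what else the loop touches.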

Out of the box I saw (when comparing linux-next against linux-next + this series)
with gup_test and pinning a 16G hugetlb file (with 1G pages):

	get_user_pages(): ~32k -> ~9k
	unpin_user_pages(): ~75k -> ~70k

Usually any tight loop calling compound_head(), or reading tail page data (e.g.
compound_head), benefits a lot. There are some unpinning inefficiencies I am fixing[0],
and with that fix added the improvement is even larger:

	unpin_user_pages(): ~27k -> ~3.8k
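
For a rough sense of scale, the before/after figures above can be turned into ratios. The inputs are the approximate ("~") values from this message, so the ratios are approximate too. The sketch assumes lower is better, and the label on the last entry reflects one reading of the text (both sides measured with [0] applied).

```python
# Figures copied from the message; the "~" means they are approximate.
results = {
    "get_user_pages()": (32_000, 9_000),
    "unpin_user_pages()": (75_000, 70_000),
    "unpin_user_pages() with [0] applied": (27_000, 3_800),
}


def ratio(before, after):
    """How many times smaller the 'after' figure is than the 'before' one."""
    return before / after


for name, (before, after) in results.items():
    print(f"{name}: ~{ratio(before, after):.1f}x lower")
```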

FWIW, I was also seeing that with devdax and the equivalent ZONE_DEVICE vmemmap page
reuse series[1], but there it was mixed in with other numbers.

Anyways, JFYI :)

	Joao

[0] https://lore.kernel.org/linux-mm/20210204202500.26474-1-joao.m.martins@oracle.com/
[1] https://lore.kernel.org/linux-mm/20201208172901.17384-1-joao.m.martins@oracle.com/