Re:
From: Yang Shi <hidden>
Date: 2022-08-31 21:47:30
Also in:
linux-mm
Hi Zach,

I did a quick look at the series, basically no show stopper to me. But I
didn't find time to review them thoroughly yet, quite busy on something
else. Just a heads up, I didn't mean to ignore you. I will review them
when I find some time.

Thanks,
Yang

On Fri, Aug 26, 2022 at 3:03 PM Zach O'Keefe [off-list ref] wrote:
Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
v2 Foreword
Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
kernel test robot.
--------------------------------
This series builds on top of the previous "mm: userspace hugepage collapse"
series which introduced the MADV_COLLAPSE madvise mode and added support
for private, anonymous mappings[1], by adding support for file and shmem
backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
File and shmem support have been added with effort to align with existing
MADV_COLLAPSE semantics and policy decisions[2]. Collapse of shmem-backed
memory ignores kernel-guiding directives and heuristics including all
sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
options (shmem always supports large folios). Like anonymous mappings, on
successful return of MADV_COLLAPSE on file/shmem memory, the contents of
memory mapped by the addresses provided will be synchronously pmd-mapped
THPs.
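To illustrate the semantics described above, here is a minimal userspace
sketch; it is not code from the series. The helper names
hpage_inner_range() and collapse_mapping() are hypothetical, the hugepage
size is supplied by the caller (2 MiB is typical on x86-64 but is an
assumption here), and the fallback MADV_COLLAPSE definition assumes the
asm-generic uapi value for libc headers that predate it. The sketch trims
a mapping to the whole hugepages it contains and requests a synchronous
collapse of that span:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* assumption: asm-generic uapi value */
#endif

/*
 * Hypothetical helper: shrink [addr, addr + len) to the largest sub-range
 * whose start and end are multiples of hpage_size. Stores the sub-range
 * start in *start and returns its length in bytes (0 if no whole
 * hugepage fits).
 */
static size_t hpage_inner_range(uintptr_t addr, size_t len, size_t hpage_size,
                                uintptr_t *start)
{
    /* hpage_size must be a non-zero power of two for the masking below. */
    assert(hpage_size && (hpage_size & (hpage_size - 1)) == 0);

    uintptr_t mask = (uintptr_t)hpage_size - 1;
    uintptr_t first = (addr + mask) & ~mask; /* round start up */
    uintptr_t last = (addr + len) & ~mask;   /* round end down */

    *start = first;
    return last > first ? (size_t)(last - first) : 0;
}

/*
 * Hypothetical wrapper: ask the kernel to collapse every whole hugepage
 * inside the mapping. Returns 0 on success, or -1 with errno set, either
 * by madvise(2) or to EINVAL when the range holds no whole hugepage.
 */
static int collapse_mapping(void *addr, size_t len, size_t hpage_size)
{
    uintptr_t start;
    size_t span = hpage_inner_range((uintptr_t)addr, len, hpage_size, &start);

    if (span == 0) {
        errno = EINVAL;
        return -1;
    }
    return madvise((void *)start, span, MADV_COLLAPSE);
}
```

A caller would pass a file or shmem mapping, for example
collapse_mapping(addr, len, 2UL << 20), and should be prepared for
failure: kernels without MADV_COLLAPSE are expected to reject the advice,
and the mapping stays usable with its existing pages either way.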
This functionality unlocks two important uses:
(1) Immediately back executable text by THPs. Current support provided
by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
system which might impair services from serving at their full rated
load after (re)starting. Tricks like mremap(2)'ing text onto
anonymous memory to immediately realize iTLB performance prevent
page sharing and demand paging, both of which increase steady state
memory footprint. Now, we can have the best of both worlds: Peak
upfront performance and lower RAM footprints.
(2) userfaultfd-based live migration of virtual machines satisfies UFFD
faults by fetching native-sized pages over the network (to avoid
latency of transferring an entire hugepage). However, after guest
memory has been fully copied to the new host, MADV_COLLAPSE can
be used to immediately increase guest performance.
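As a rough illustration of use (1), the sketch below walks
/proc/self/maps and issues MADV_COLLAPSE on each read-only, executable,
file-backed mapping of the calling process. It is not part of the series:
parse_text_mapping() and collapse_own_text() are hypothetical names, the
fallback MADV_COLLAPSE value is assumed from the asm-generic uapi
headers, and the "r-x plus an absolute path" filter is only an
approximation of the eligibility rules the kernel applies.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* assumption: asm-generic uapi value */
#endif

struct text_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Hypothetical helper: parse one /proc/<pid>/maps line
 * ("start-end perms offset dev inode path"). Returns true and fills *out
 * only for a readable, non-writable, executable mapping whose path is
 * absolute, which skips anonymous memory and entries such as [vdso].
 */
static bool parse_text_mapping(const char *line, struct text_range *out)
{
    uint64_t start, end;
    char perms[5], path[256];

    assert(line && out);
    if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %*s %*s %*s %255s",
               &start, &end, perms, path) != 4)
        return false; /* malformed line or no path (anonymous mapping) */
    if (perms[0] != 'r' || perms[1] != '-' || perms[2] != 'x' ||
        path[0] != '/')
        return false;

    out->start = (uintptr_t)start;
    out->end = (uintptr_t)end;
    return true;
}

/*
 * Hypothetical helper: try to collapse the caller's own text mappings.
 * Returns the number of mappings for which madvise(2) succeeded, or -1
 * if /proc/self/maps could not be opened. Individual madvise() failures
 * are skipped rather than treated as fatal.
 */
static int collapse_own_text(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    int collapsed = 0;

    if (!maps)
        return -1;
    while (fgets(line, sizeof(line), maps)) {
        struct text_range r;

        if (parse_text_mapping(line, &r) &&
            madvise((void *)r.start, r.end - r.start, MADV_COLLAPSE) == 0)
            collapsed++;
    }
    fclose(maps);
    return collapsed;
}
```

A service could call collapse_own_text() once early in startup. Because
the text stays file-backed, page sharing and demand paging are kept,
which is the trade-off the mremap(2) trick gives up.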
khugepaged has received a small improvement by association and can now
detect and collapse pte-mapped THPs. However, there is still work to be
done along the file collapse path. Compound pages of arbitrary order still
need to be supported and THP collapse needs to be converted to using
folios in general. Eventually, we'd like to move away from the read-only
and executable-mapped constraints currently imposed on eligible files and
support any inode claiming huge folio support. That said, I think the
series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
memory.
Patches 1-3   Implement the guts of the series.
Patch 4       Is a tracepoint for debugging.
Patches 5-8   Refactor existing khugepaged selftests to work with new
              memory types.
Patch 9       Adds a userfaultfd selftest mode to mimic a functional test
              of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
Applies against mm-unstable.
[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
v1 -> v2:
- Add missing definition for khugepaged_add_pte_mapped_thp() in
!CONFIG_SHMEM builds, in "mm/khugepaged: attempt to map
file/shmem-backed pte-mapped THPs by pmds"
- Minor bugfixes in "mm/madvise: add file and shmem support to
MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
compiler settings.
- Rebased on latest mm-unstable
Zach O'Keefe (9):
mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
pmds
mm/madvise: add file and shmem support to MADV_COLLAPSE
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
selftests/vm: dedup THP helpers
selftests/vm: modularize thp collapse memory operations
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: add thp collapse shmem testing
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
 include/linux/khugepaged.h               |   13 +-
 include/linux/shmem_fs.h                 |   10 +-
 include/trace/events/huge_memory.h       |   36 +
 kernel/events/uprobes.c                  |    2 +-
 mm/huge_memory.c                         |    2 +-
 mm/khugepaged.c                          |  289 ++++--
 mm/shmem.c                               |   18 +-
 tools/testing/selftests/vm/Makefile      |    2 +
 tools/testing/selftests/vm/khugepaged.c  |  828 ++++++++++++------
 tools/testing/selftests/vm/soft-dirty.c  |    2 +-
 .../selftests/vm/split_huge_page_test.c  |   12 +-
 tools/testing/selftests/vm/userfaultfd.c |  171 +++-
 tools/testing/selftests/vm/vm_util.c     |   36 +-
 tools/testing/selftests/vm/vm_util.h     |    5 +-
 14 files changed, 1040 insertions(+), 386 deletions(-)
--
2.37.2.672.g94769d06f0-goog