Thread: 15 messages, 5 authors, 2025-03-19

Re: [PATCH 1/2] fs/proc/task_mmu: add guard region bit to pagemap

From: Lorenzo Stoakes <hidden>
Date: 2025-03-19 19:13:14
Also in: linux-doc, linux-fsdevel, linux-kselftest, linux-mm, lkml

+cc Greg for stable question

On Wed, Mar 19, 2025 at 11:22:40AM -0700, Andrei Vagin wrote:
On Mon, Feb 24, 2025 at 2:39 AM David Hildenbrand [off-list ref] wrote:
On 24.02.25 11:18, Lorenzo Stoakes wrote:
[snip]
Acked-by: David Hildenbrand <redacted>
Thanks! :)
Something that might be interesting is also extending the PAGEMAP_SCAN
ioctl.
Yeah, funny you should mention that, I did see that, but on reading the man
page it struck me that it requires the region to be uffd afaict? All the
tests seem to establish uffd, and the man page implies it:

        To start tracking the written state (flag) of a page or range of
        memory, the UFFD_FEATURE_WP_ASYNC must be enabled by UFFDIO_API
        ioctl(2) on userfaultfd and memory range must be registered with
        UFFDIO_REGISTER ioctl(2) in UFFDIO_REGISTER_MODE_WP mode.

It would be a bit of a weird edge case to add support there. I was excited
when I first saw this ioctl, then disappointed afterwards... but maybe I
got it wrong?
I never managed to review that fully, but I think that
UFFD_FEATURE_WP_ASYNC thingy is only required for PM_SCAN_CHECK_WPASYNC
and PM_SCAN_WP_MATCHING.

See pagemap_scan_test_walk().

I do recall that it works on any VMA.

Ah yes, tools/testing/selftests/mm/vm_util.c ends up using it for
pagemap_is_swapped() and friends via page_entry_is() to sanity check
that what pagemap gives us is consistent with what pagemap_scan gives us.

So it should work independent of the uffd magic.
I might be wrong, though ...

PAGEMAP_SCAN can work without the UFFD magic. CRIU utilizes PAGEMAP_SCAN
as a more efficient alternative to /proc/pid/pagemap:
https://github.com/checkpoint-restore/criu/blob/d18912fc88f3dc7bde5fdfa3575691977eb21753/criu/pagemap-cache.c#L178
Yeah, we ascertained that. It is on my list; LSF coming up next week means we
aren't great on timing here, but I'll prioritise this when I'm back.
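
For reference, a read-only PAGEMAP_SCAN request of the kind discussed above might look roughly like the sketch below. It leaves flags at 0, so neither PM_SCAN_WP_MATCHING nor PM_SCAN_CHECK_WPASYNC is set and no userfaultfd registration should be needed. The helper names fill_scan_arg() and scan_range() are hypothetical, the fallback definitions are only for older uapi headers, and this is not how CRIU or the selftests actually do it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

#ifndef PAGEMAP_SCAN
/* Fallback for uapi headers that predate PAGEMAP_SCAN; intended to
 * mirror the definitions in include/uapi/linux/fs.h. */
struct page_region {
	__u64 start;
	__u64 end;
	__u64 categories;
};
struct pm_scan_arg {
	__u64 size;
	__u64 flags;
	__u64 start;
	__u64 end;
	__u64 walk_end;
	__u64 vec;
	__u64 vec_len;
	__u64 max_pages;
	__u64 category_inverted;
	__u64 category_mask;
	__u64 category_anyof_mask;
	__u64 return_mask;
};
#define PAGEMAP_SCAN	_IOWR('f', 16, struct pm_scan_arg)
#define PAGE_IS_PRESENT	(1 << 3)
#define PAGE_IS_SWAPPED	(1 << 4)
#endif

/* Hypothetical helper: build a query-only request. flags stays 0, so the
 * write-protect paths that depend on UFFD_FEATURE_WP_ASYNC are not used. */
static void fill_scan_arg(struct pm_scan_arg *arg, void *start, size_t len,
			  struct page_region *vec, size_t vec_len)
{
	memset(arg, 0, sizeof(*arg));
	arg->size = sizeof(*arg);
	arg->flags = 0;
	arg->start = (uintptr_t)start;		/* should be page-aligned */
	arg->end = (uintptr_t)start + len;
	arg->vec = (uintptr_t)vec;
	arg->vec_len = vec_len;
	/* Report ranges that are present or swapped out. */
	arg->category_anyof_mask = PAGE_IS_PRESENT | PAGE_IS_SWAPPED;
	arg->return_mask = PAGE_IS_PRESENT | PAGE_IS_SWAPPED;
}

/* Hypothetical helper: pagemap_fd is an open fd on /proc/<pid>/pagemap.
 * Returns the number of regions written to vec, or -1 with errno set. */
static long scan_range(int pagemap_fd, void *start, size_t len,
		       struct page_region *vec, size_t vec_len)
{
	struct pm_scan_arg arg;

	fill_scan_arg(&arg, start, len, vec, vec_len);
	return ioctl(pagemap_fd, PAGEMAP_SCAN, &arg);
}
```

The kernel coalesces adjacent pages with matching categories into a single struct page_region, so a small vec can describe a large range.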
For CRIU, obtaining information about guard regions is critical.
Without this functionality in the kernel, CRIU is broken. We probably should
consider backporting these changes to the 6.13 and 6.14 stable branches.
I'm not sure on precedent for backporting a feature like this - Greg? Am
happy to do it though.

As a stop gap we can backport the pagemap feature if Greg feels this is
appropriate?

[snip]
My thinking was, that if you have a large VMA, with ordinary pagemap you
have to copy 8 bytes per entry (and have room for that somewhere in user
space). In theory, with the scanning feature, you can leave that ...
scanning to the kernel and don't have to do any copying/allocate space
for it in user space etc.
PAGEMAP_SCAN doesn't have this issue and it was one of the reasons to
implement it.
Ack.
Thanks,
Andrei
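
To put a number on the "8 bytes per entry" cost mentioned above, here is a minimal sketch of the arithmetic for plain /proc/<pid>/pagemap reads, plus a check for the guard region bit this patch adds. The helper names are hypothetical. The bit positions are taken to be those in Documentation/admin-guide/mm/pagemap.rst, with bit 58 assumed to be the guard region bit from this series.

```c
#include <stddef.h>
#include <stdint.h>

/* pagemap entry bits (bit 58 assumed to be the guard region bit
 * introduced by this patch). */
#define PM_PRESENT	(1ULL << 63)
#define PM_SWAP		(1ULL << 62)
#define PM_GUARD_REGION	(1ULL << 58)

/* Each page is described by one 64-bit entry. */
#define PM_ENTRY_BYTES	8

/* Hypothetical helper: bytes of user-space buffer needed to read the
 * pagemap entries covering range_bytes of virtual address space. */
static size_t pagemap_buf_bytes(size_t range_bytes, size_t page_size)
{
	return (range_bytes / page_size) * PM_ENTRY_BYTES;
}

/* Hypothetical helper: file offset in /proc/<pid>/pagemap of the entry
 * describing the page that contains vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, size_t page_size)
{
	return (vaddr / page_size) * PM_ENTRY_BYTES;
}

/* Hypothetical helper: non-zero if the entry marks a guard region. */
static int pagemap_entry_is_guard(uint64_t entry)
{
	return (entry & PM_GUARD_REGION) != 0;
}
```

With 4 KiB pages, pagemap_buf_bytes(1UL << 30, 4096) comes to 2 MiB of entries for a 1 GiB VMA, all of which must be copied to user space and walked there. PAGEMAP_SCAN avoids this by doing the walk in the kernel and returning only matching ranges.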