Thread: 26 messages, 4 authors, 2025-10-17

Re: PROBLEM: userfaultfd REGISTER minor mode on MAP_PRIVATE range fails

From: James Houghton <hidden>
Date: 2025-09-15 20:25:16

On Mon, Sep 15, 2025 at 1:13 PM David P. Reed [off-list ref] wrote:

[1.] One line summary of the problem: userfaultfd REGISTER minor mode on MAP_PRIVATE fails
[2.] Full description of the problem/report:
The userfaultfd man page and the kernel docs seem to indicate that an area mapped
MAP_PRIVATE|MAP_ANONYMOUS can be registered to handle MINOR page faults on regular pages.
However, testing shows that this does not work. MAP_SHARED does allow registration for MINOR
page fault events, though.
Either the documentation or the code should be fixed, IMO. Reading the kernel source, the
check in vma_can_userfault() that rejects this case is:
        if ((vm_flags & VM_UFFD_MINOR) &&
            (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
                return false;
which probably should include !vma_is_anonymous(vma).

Or maybe the COW that might happen if the program were forked is something that can't be handled, which seems odd.

UFFDIO_CONTINUE, the resolution ioctl for userfaultfd minor faults,
doesn't have defined semantics for MAP_PRIVATE mappings. The
documentation does not make clear that MAP_PRIVATE + userfaultfd minor
faults is an invalid combination, but rejecting it is intentional
behavior.
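
For anyone who wants to reproduce the behavior described in the report, a minimal sketch follows. The helper name try_register_minor() is hypothetical and not taken from the kernel tree or from the original report. It assumes a kernel with shmem minor-fault support and a system policy that allows unprivileged userfaultfd in user-mode-only form. The fallback #defines are only there for older uapi headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older uapi headers; values mirror the upstream header. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif
#ifndef UFFD_FEATURE_MINOR_SHMEM
#define UFFD_FEATURE_MINOR_SHMEM (1 << 10)
#endif
#ifndef UFFDIO_REGISTER_MODE_MINOR
#define UFFDIO_REGISTER_MODE_MINOR ((__u64)1 << 2)
#endif

/*
 * Hypothetical helper: map one anonymous page with the given mmap flags
 * and try to register it for minor faults. Returns 0 if registration
 * succeeds, or the negative errno of the first step that fails.
 */
static int try_register_minor(int map_flags)
{
	struct uffdio_api api = {
		.api = UFFD_API,
		.features = UFFD_FEATURE_MINOR_SHMEM,
	};
	struct uffdio_register reg = { .mode = UFFDIO_REGISTER_MODE_MINOR };
	long page = sysconf(_SC_PAGESIZE);
	void *addr;
	int uffd, ret = 0;

	/* The caller must pick a sharing type. */
	assert(map_flags & (MAP_PRIVATE | MAP_SHARED));

	uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
	if (uffd < 0)
		return -errno;

	/* Handshake; fails if the kernel lacks shmem minor-fault support. */
	if (ioctl(uffd, UFFDIO_API, &api) < 0) {
		ret = -errno;
		goto out_close;
	}

	addr = mmap(NULL, page, PROT_READ | PROT_WRITE, map_flags, -1, 0);
	if (addr == MAP_FAILED) {
		ret = -errno;
		goto out_close;
	}

	/* This is the call that reaches vma_can_userfault(). */
	reg.range.start = (uintptr_t)addr;
	reg.range.len = page;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
		ret = -errno;

	munmap(addr, page);
out_close:
	close(uffd);
	return ret;
}
```

Going by the check quoted above, try_register_minor(MAP_PRIVATE | MAP_ANONYMOUS) should come back with -EINVAL, while try_register_minor(MAP_SHARED | MAP_ANONYMOUS) should return 0 where shmem minor faults are available. Other negative values (for example -EPERM or -ENOSYS) most likely mean userfaultfd itself is not usable in that environment.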

What would you like UFFDIO_CONTINUE on MAP_PRIVATE to do? Should it
populate a read-only PTE? Should it do CoW and populate a writable
PTE? I'm curious to hear more about your use case (and why UFFDIO_COPY
doesn't do what you want).
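
For comparison, here is a rough sketch of the UFFDIO_COPY path on a MAP_PRIVATE|MAP_ANONYMOUS range, registered in missing mode rather than minor mode. The helper name copy_into_private_page() is made up for illustration. The sketch populates the page up front instead of waiting for a fault, so it does not need a handler thread. It assumes unprivileged userfaultfd is permitted in user-mode-only form.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for older uapi headers; value mirrors the upstream header. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/*
 * Hypothetical helper: register a private anonymous page in missing
 * mode and fill it with UFFDIO_COPY. Returns 1 if the page then holds
 * the copied data, 0 if it does not, or a negative errno if a step fails.
 */
static int copy_into_private_page(void)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = { .mode = UFFDIO_REGISTER_MODE_MISSING };
	struct uffdio_copy copy = { .mode = 0 };
	long page = sysconf(_SC_PAGESIZE);
	char *dst, *src;
	int uffd, ret;

	uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
	if (uffd < 0)
		return -errno;

	if (ioctl(uffd, UFFDIO_API, &api) < 0) {
		ret = -errno;
		goto out_close;
	}

	/* Two pages: the first is the destination, the second the source. */
	dst = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED) {
		ret = -errno;
		goto out_close;
	}
	src = dst + page;
	memset(src, 0x5a, page);	/* touches only the source page */

	/* Register only the still-unpopulated destination page. */
	reg.range.start = (uintptr_t)dst;
	reg.range.len = page;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0) {
		ret = -errno;
		goto out_unmap;
	}

	/* Atomically install a copy of src at dst. */
	copy.dst = (uintptr_t)dst;
	copy.src = (uintptr_t)src;
	copy.len = page;
	if (ioctl(uffd, UFFDIO_COPY, &copy) < 0) {
		ret = -errno;
		goto out_unmap;
	}

	/* dst is populated now, so reading it does not raise a fault. */
	ret = memcmp(dst, src, page) == 0;
out_unmap:
	munmap(dst, 2 * page);
out_close:
	close(uffd);
	return ret;
}
```

The destination is only read after UFFDIO_COPY has succeeded. Touching it earlier would raise a missing fault that nothing in this sketch handles, and the thread would block.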