[PATCH 5/5] mm: add process_memwatch syscall documentation
From: Muhammad Usama Anjum <hidden>
Date: 2022-07-26 16:21:51
Also in:
linux-arch, linux-doc, linux-fsdevel, linux-kselftest, linux-perf-users, lkml
Subsystem:
documentation, memory management - misc, the rest · Maintainers:
Jonathan Corbet, Andrew Morton, David Hildenbrand, Linus Torvalds
Add the syscall with explanation of the operations. Signed-off-by: Muhammad Usama Anjum <redacted> --- Documentation/admin-guide/mm/soft-dirty.rst | 48 ++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/mm/soft-dirty.rst b/Documentation/admin-guide/mm/soft-dirty.rst
index cb0cfd6672fa..030d75658010 100644
--- a/Documentation/admin-guide/mm/soft-dirty.rst
+++ b/Documentation/admin-guide/mm/soft-dirty.rst@@ -5,7 +5,12 @@ Soft-Dirty PTEs =============== The soft-dirty is a bit on a PTE which helps to track which pages a task -writes to. In order to do this tracking one should +writes to. + +Using Proc FS +------------- + +In order to do this tracking one should 1. Clear soft-dirty bits from the task's PTEs.
@@ -20,6 +25,47 @@ writes to. In order to do this tracking one should 64-bit qword is the soft-dirty one. If set, the respective PTE was written to since step 1. +Using System Call +----------------- + +process_memwatch system call can be used to find the dirty pages.:: + + long process_memwatch(int pidfd, unsigned long start, int len, + unsigned int flags, void *vec, int vec_len); + +The pidfd specifies the pidfd of process whose memory needs to be watched. +The calling process must have PTRACE_MODE_ATTACH_FSCREDS capabilities over +the process whose pidfd has been specified. It can be zero which means that +the process wants to watch its own memory. The operation is determined by +flags. The start argument must be a multiple of the system page size. The +len argument need not be a multiple of the page size, but since the +information is returned for the whole pages, len is effectively rounded +up to the next multiple of the page size. + +The vec is output array in which the offsets of the pages are returned. +Offset is calculated from start address. User lets the kernel know about the +size of the vec by passing size in vec_len. The system call returns when the +whole range has been searched or vec is completely filled. The whole range +isn't cleared if vec fills up completely. + +The flags argument specifies the operation to be performed. The MEMWATCH_SD_GET +and MEMWATCH_SD_CLEAR operations can be used separately or together to perform +MEMWATCH_SD_GET and MEMWATCH_SD_CLEAR atomically as one operation.:: + + MEMWATCH_SD_GET + Get the page offsets which are soft dirty. + + MEMWATCH_SD_CLEAR + Clear the pages which are soft dirty. + + MEMWATCH_SD_NO_REUSED_REGIONS + This optional flag can be specified in combination with other flags. + VM_SOFTDIRTY is ignored for the VMAs for performances reasons. This + flag shows only those pages dirty which have been written to by the + user. All new allocations aren't returned to be dirty. + +Explanation +----------- Internally, to do this tracking, the writable bit is cleared from PTEs when the soft-dirty bit is cleared. So, after this, when the task tries to
--
2.30.2