Re: [PATCH v3 1/1] process_madvise.2: Add process_madvise man page
From: Michael Kerrisk (man-pages) <hidden>
Date: 2021-02-13 22:05:27
Also in:
linux-api, linux-man, linux-mm, lkml, selinux
Hello Suren, On 2/2/21 11:12 PM, Suren Baghdasaryan wrote:
Hi Michael, On Tue, Feb 2, 2021 at 2:45 AM Michael Kerrisk (man-pages) [off-list ref] wrote:quoted
Hello Suren (and Minchan and Michal) Thank you for the revisions! I've applied this patch, and done a few light edits.Thanks!quoted
However, I have a questions about undocumented pieces in *madvise(2)*, as well as one other question. See below. On 2/2/21 6:30 AM, Suren Baghdasaryan wrote:quoted
Initial version of process_madvise(2) manual page. Initial text was extracted from [1], amended after fix [2] and more details added using man pages of madvise(2) and process_vm_read(2) as examples. It also includes the changes to required permission proposed in [3]. [1] https://lore.kernel.org/patchwork/patch/1297933/ [2] https://lkml.org/lkml/2020/12/8/1282 [3] https://patchwork.kernel.org/project/selinux/patch/20210111170622.2613577-1-surenb@google.com/#23888311 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> --- changes in v2: - Changed description of MADV_COLD per Michal Hocko's suggestion - Applied fixes suggested by Michael Kerrisk changes in v3: - Added Michal's Reviewed-by - Applied additional fixes suggested by Michael Kerrisk NAME process_madvise - give advice about use of memory to a process SYNOPSIS #include <sys/uio.h> ssize_t process_madvise(int pidfd, const struct iovec *iovec, unsigned long vlen, int advice, unsigned int flags); DESCRIPTION The process_madvise() system call is used to give advice or directions to the kernel about the address ranges of another process or the calling process. It provides the advice to the address ranges described by iovec and vlen. The goal of such advice is to improve system or application performance. The pidfd argument is a PID file descriptor (see pidfd_open(2)) that specifies the process to which the advice is to be applied. The pointer iovec points to an array of iovec structures, defined in <sys/uio.h> as: struct iovec { void *iov_base; /* Starting address */ size_t iov_len; /* Number of bytes to transfer */ }; The iovec structure describes address ranges beginning at iov_base address and with the size of iov_len bytes. The vlen represents the number of elements in the iovec structure. The advice argument is one of the values listed below. Linux-specific advice values The following Linux-specific advice values have no counterparts in the POSIX-specified posix_madvise(3), and may or may not have counterparts in the madvise(2) interface available on other implementations. MADV_COLD (since Linux 5.4.1)I just noticed these version numbers now, and thought: they can't be right (because the system call appeared only in v5.11). So I removed them. But, of course in another sense the version numbers are (nearly) right, since these advice values were added for madvise(2) in Linux 5.4. However, they are not documented in the madvise(2) manual page. Is it correct to assume that MADV_COLD and MADV_PAGEOUT have exactly the same meaning in madvise(2) (but just for the calling process, of course)?Correct. They should be added in the madvise(2) man page as well IMHO.
So, I decided to move the description of MADV_COLD and MADV_PAGEOUT to madvise(2) and refer to that page from the process_madvise(2) page. This avoids repeating the same information in two places.
quoted
quoted
Deactive a given range of pages which will make them a more probableI changed: s/Deactive/Deactivate/thanks!quoted
quoted
reclaim target should there be a memory pressure. This is a nondestructive operation. The advice might be ignored for some pages in the range when it is not applicable. MADV_PAGEOUT (since Linux 5.4.1) Reclaim a given range of pages. This is done to free up memory occupied by these pages. If a page is anonymous it will be swapped out. If a page is file-backed and dirty it will be written back to the backing storage. The advice might be ignored for some pages in the range when it is not applicable.[...]quoted
The hint might be applied to a part of iovec if one of its elements points to an invalid memory region in the remote process. No further elements will be processed beyond that point.Is the above scenario the one that leads to the partial advice case described in RETURN VALUE? If yes, perhaps I should add some words to make that clearer.Correct. This describes the case when partial advice happens.
Thanks. I added a few words to clarify this.
quoted
You can see the light edits that I made in https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e3ce016472a1b3ec5dffdeb23c98b9fef618a97b and following that I restructured DESCRIPTION a little in https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3aac0708a9acee5283e091461de6a8410bc921a6The edits LGTM.
Thanks for checking them. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/