Thread (7 messages), 2 authors, 2021-02-18

Re: [PATCH v3 1/1] process_madvise.2: Add process_madvise man page

From: Michael Kerrisk (man-pages) <hidden>
Date: 2021-02-13 22:05:27
Also in: linux-api, linux-man, linux-mm, lkml, selinux

Hello Suren,

On 2/2/21 11:12 PM, Suren Baghdasaryan wrote:
Hi Michael,

On Tue, Feb 2, 2021 at 2:45 AM Michael Kerrisk (man-pages)
[off-list ref] wrote:
quoted
Hello Suren (and Minchan and Michal)

Thank you for the revisions!

I've applied this patch, and done a few light edits.
Thanks!
quoted
However, I have a question about undocumented pieces in *madvise(2)*,
as well as one other question. See below.

On 2/2/21 6:30 AM, Suren Baghdasaryan wrote:
quoted
Initial version of process_madvise(2) manual page. Initial text was
extracted from [1], amended after fix [2] and more details added using
man pages of madvise(2) and process_vm_read(2) as examples. It also
includes the changes to required permission proposed in [3].

[1] https://lore.kernel.org/patchwork/patch/1297933/
[2] https://lkml.org/lkml/2020/12/8/1282
[3] https://patchwork.kernel.org/project/selinux/patch/20210111170622.2613577-1-surenb@google.com/#23888311

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
---
changes in v2:
- Changed description of MADV_COLD per Michal Hocko's suggestion
- Applied fixes suggested by Michael Kerrisk
changes in v3:
- Added Michal's Reviewed-by
- Applied additional fixes suggested by Michael Kerrisk

NAME
    process_madvise - give advice about use of memory to a process

SYNOPSIS
    #include <sys/uio.h>

    ssize_t process_madvise(int pidfd,
                           const struct iovec *iovec,
                           unsigned long vlen,
                           int advice,
                           unsigned int flags);

DESCRIPTION
    The process_madvise() system call is used to give advice or directions
    to the kernel about the address ranges of another process or the calling
    process. It provides the advice to the address ranges described by iovec
    and vlen. The goal of such advice is to improve system or application
    performance.

    The pidfd argument is a PID file descriptor (see pidfd_open(2)) that
    specifies the process to which the advice is to be applied.

    The pointer iovec points to an array of iovec structures, defined in
    <sys/uio.h> as:

    struct iovec {
        void  *iov_base;    /* Starting address */
        size_t iov_len;     /* Number of bytes to transfer */
    };

    Each iovec structure describes an address range beginning at address
    iov_base and spanning iov_len bytes.

    The vlen argument specifies the number of elements in the iovec array.
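
    As an illustration of how the arguments described above fit together,
    here is a minimal sketch that is not part of the patch. It assumes the
    C library provides no wrapper, so it goes through syscall(2). The helper
    names make_range() and advise_ranges() are hypothetical, the fallback
    syscall number 440 is an assumption for older headers on most
    architectures, and flags is passed as 0 on the assumption that no flags
    are defined yet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumption: older kernel headers may lack this constant; 440 is the
 * number used on most architectures. */
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/* Hypothetical helper: describe one address range in the target process. */
struct iovec make_range(void *start, size_t len)
{
    assert(len > 0);            /* an empty range carries no advice */
    struct iovec v = { .iov_base = start, .iov_len = len };
    return v;
}

/* Hypothetical wrapper around the raw system call.
 * pidfd:  PID file descriptor of the target, e.g. from pidfd_open(2)
 * iov:    array of ranges in the target's address space
 * vlen:   number of elements in iov
 * advice: for example MADV_COLD or MADV_PAGEOUT
 * Returns the number of bytes advised, or -1 with errno set. */
ssize_t advise_ranges(int pidfd, const struct iovec *iov,
                      size_t vlen, int advice)
{
    /* flags is assumed to be 0 here */
    return syscall(SYS_process_madvise, pidfd, iov,
                   (unsigned long)vlen, advice, 0U);
}
```

    A caller would fill an array with make_range() for each range of
    interest and pass the array and its element count to advise_ranges().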

    The advice argument is one of the values listed below.

  Linux-specific advice values
    The following Linux-specific advice values have no counterparts in the
    POSIX-specified posix_madvise(3), and may or may not have counterparts
    in the madvise(2) interface available on other implementations.

    MADV_COLD (since Linux 5.4.1)
I just noticed these version numbers now, and thought: they can't be
right (because the system call appeared only in v5.11). So I removed
them. But, of course in another sense the version numbers are (nearly)
right, since these advice values were added for madvise(2) in Linux 5.4.
However, they are not documented in the madvise(2) manual page. Is it
correct to assume that MADV_COLD and MADV_PAGEOUT have exactly the same
meaning in madvise(2) (but just for the calling process, of course)?
Correct. They should be added in the madvise(2) man page as well IMHO.
So, I decided to move the description of MADV_COLD and MADV_PAGEOUT
to madvise(2) and refer to that page from the process_madvise(2)
page. This avoids repeating the same information in two places.
quoted
quoted
        Deactive a given range of pages which will make them a more probable
I changed: s/Deactive/Deactivate/
thanks!
quoted
quoted
        reclaim target should there be memory pressure. This is a
        nondestructive operation. The advice might be ignored for some pages
        in the range when it is not applicable.

    MADV_PAGEOUT (since Linux 5.4.1)
        Reclaim a given range of pages. This is done to free up memory occupied
        by these pages. If a page is anonymous it will be swapped out. If a
        page is file-backed and dirty it will be written back to the backing
        storage. The advice might be ignored for some pages in the range when
        it is not applicable.
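
    Since the two values have the same meaning in madvise(2) for the calling
    process, their nondestructive nature can be checked locally. The sketch
    below is only an illustration: the function name
    contents_survive_advice() is hypothetical, and the fallback values 20
    and 21 are assumptions for C libraries whose headers predate these
    constants. A failing madvise() call (for example on a kernel that lacks
    the advice value) is ignored, because the check is about the page
    contents only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: values from the kernel's generic mman header, used only
 * when the C library headers do not define them. */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Map one anonymous page, fill it with a pattern, apply the advice to the
 * calling process, and check that the pattern is still there.
 * Returns 1 if the contents are intact, 0 if not, -1 if mmap() failed. */
int contents_survive_advice(int advice)
{
    assert(advice == MADV_COLD || advice == MADV_PAGEOUT);

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xA5, len);

    /* The advice is a hint; an error here is deliberately ignored. */
    (void)madvise(p, len, advice);

    int intact = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xA5) {
            intact = 0;
            break;
        }
    }

    munmap(p, len);
    return intact;
}
```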
[...]
quoted
    The hint might be applied to a part of iovec if one of its elements points
    to an invalid memory region in the remote process. No further elements will
    be processed beyond that point.
Is the above scenario the one that leads to the partial advice case described in
RETURN VALUE? If yes, perhaps I should add some words to make that clearer.
Correct. This describes the case when partial advice happens.
Thanks. I added a few words to clarify this.
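
For illustration only, a caller could detect the partial-advice case by
comparing the returned byte count against the iovec array. The sketch below
assumes that the return value is the number of bytes advised and that
processing stops at the first failing element, as discussed above. The
function name first_unadvised() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Given the iovec array passed to process_madvise() and its return value,
 * return the index of the first element that was not fully advised.
 * A result equal to vlen means every element was covered. */
size_t first_unadvised(const struct iovec *iov, size_t vlen, ssize_t advised)
{
    assert(iov != NULL || vlen == 0);

    /* A negative return value means nothing was advised. */
    size_t remaining = advised > 0 ? (size_t)advised : 0;
    size_t i = 0;

    /* Consume whole elements while the advised byte count covers them. */
    while (i < vlen && remaining >= iov[i].iov_len) {
        remaining -= iov[i].iov_len;
        i++;
    }
    return i;
}
```

If the result is smaller than vlen, the element at that index is the one to
inspect, and the elements after it were not processed.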

quoted
You can see the light edits that I made in
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e3ce016472a1b3ec5dffdeb23c98b9fef618a97b
and following that I restructured DESCRIPTION a little in
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3aac0708a9acee5283e091461de6a8410bc921a6
The edits LGTM.
Thanks for checking them.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/