Thread: 11 messages, 5 authors, 2020-11-17

Re: [PATCH v9 3/3] mm/madvise: introduce process_madvise() syscall: an external memory hinting API

From: Michal Hocko <mhocko@suse.com>
Date: 2020-09-21 07:14:15
Also in: linux-man, linux-mm, lkml

On Mon 21-09-20 07:56:33, Christoph Hellwig wrote:
> On Mon, Aug 31, 2020 at 05:06:33PM -0700, Minchan Kim wrote:
> > There is usecase that System Management Software(SMS) want to give a
> > memory hint like MADV_[COLD|PAGEOUT] to other processes and in the
> > case of Android, it is the ActivityManagerService.
> >
> > The information required to make the reclaim decision is not known to
> > the app.  Instead, it is known to the centralized userspace
> > daemon(ActivityManagerService), and that daemon must be able to
> > initiate reclaim on its own without any app involvement.
> >
> > To solve the issue, this patch introduces a new syscall process_madvise(2).
> > It uses pidfd of an external process to give the hint. It also supports
> > vector address range because Android app has thousands of vmas due to
> > zygote so it's totally waste of CPU and power if we should call the
> > syscall one by one for each vma.(With testing 2000-vma syscall vs
> > 1-vector syscall, it showed 15% performance improvement.  I think it
> > would be bigger in real practice because the testing ran very cache
> > friendly environment).
>
> I'm really not sure this syscall is a good idea.  If you want central
> control you should implement an IPC mechanism that allows your
> supervisor daemon to tell the application to perform the madvise
> instead of forcing the behavior on it.

I am not entirely happy about the interface [1]. But since I seem to be
in the minority with my concern, I backed off and decided not to block
this work, because I do not see a problem with the functionality itself.
I also find it very useful for the userspace-driven memory management
that people have been asking for for a long time.

This functionality shouldn't be much different from standard memory
reclaim. It has some limitations (e.g. it can only handle mapped memory),
but it allows userspace to proactively swap out or reclaim disk-based
memory based on specific knowledge of the workload. The kernel is not
able to do the same.
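
For illustration only, here is a minimal sketch of how a supervisor
daemon might hint several ranges of another process in one call. It
assumes the interface as it was later merged in Linux 5.10
(process_madvise(pidfd, iovec, vlen, advice, flags)), which may differ
in detail from the v9 patch under discussion. The names hint_range,
fill_iovec and hint_remote are hypothetical helpers, and the fallback
syscall and advice numbers are assumptions taken from the generic
numbering, for use only when the system headers do not define them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallbacks for older headers (assumed generic numbering). */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/* Hypothetical description of one address range in the target process. */
struct hint_range {
	void *start;
	size_t len;
};

/*
 * Pack non-empty ranges into an iovec array so that many VMAs can be
 * hinted with one syscall instead of one syscall per VMA.
 * Returns the number of iovec entries filled (at most cap).
 */
size_t fill_iovec(const struct hint_range *r, size_t n,
		  struct iovec *iov, size_t cap)
{
	size_t used = 0;

	assert(iov != NULL || cap == 0);
	for (size_t i = 0; i < n && used < cap; i++) {
		if (r[i].len == 0)
			continue;	/* skip empty ranges */
		iov[used].iov_base = r[i].start;
		iov[used].iov_len = r[i].len;
		used++;
	}
	return used;
}

/*
 * Apply `advice` (e.g. MADV_COLD or MADV_PAGEOUT) to the given ranges of
 * process `pid`. Returns the number of bytes advised, or -1 with errno
 * set (e.g. ENOSYS on old kernels, EPERM without sufficient privilege).
 */
ssize_t hint_remote(pid_t pid, const struct iovec *iov, size_t n, int advice)
{
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0U);
	ssize_t ret;
	int saved;

	if (pidfd < 0)
		return -1;

	ret = (ssize_t)syscall(SYS_process_madvise, pidfd, iov,
			       (unsigned long)n, advice, 0U);
	saved = errno;		/* close() must not clobber the error */
	close(pidfd);
	errno = saved;
	return ret;
}
```

A caller would fill an array of hint_range entries from whatever it
knows about the target's address space, pass it through fill_iovec(),
and then call hint_remote(pid, iov, n, MADV_PAGEOUT). The call can
succeed only partially, so the returned byte count should be compared
against the total length requested.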

[1] http://lkml.kernel.org/r/20200117115225.GV19428@dhcp22.suse.cz
-- 
Michal Hocko
SUSE Labs